1 Protocol and SAP versions

This Complete Study Report (CSR) v1.0 is based on Protocol v1.3, dated 5 January 2022 and approved by NHSRC and Statistical Analysis Plan (SAP) v1.1, dated 6 October 2022.

2 Protocol summary

2.1 Study design

A double-blinded randomised controlled trial of adult healthy human participants experimentally exposed to escalating doses of Streptococcus pneumoniae in the nasopharynx. The intervention vaccine will be PCV-13 (Prevnar-13) and the control vaccine will be 0.9% saline.

2.2 Study Endpoints

Primary endpoint: Detection of the inoculated pneumococci by classical culture methods, at any time point, from nasal wash recovered from the participants at days 2, 7 and 14 following the initial pneumococcal challenge.

Secondary endpoints: Pneumococcal carriage density and duration will also be measured to inform the primary endpoint. Innate, humoral and cellular responses to pneumococcal colonisation will be assessed by immunological assays on collected samples. These data will allow us to define the host variables that predict colonisation and protection due to PCV-13 vaccination.

2.3 Objectives

Main objective

Determine if PCV-13 vaccination is protective against pneumococcal carriage in healthy adult Malawian volunteers.

Secondary objectives

  1. Determine how pneumococcal dose influences carriage in PCV-13 vaccinated adults.
  2. Determine PCV-13 protection against pneumococcal re-challenge 12-months post vaccination.
  3. Determine the protective effect of prior pneumococcal carriage in Malawian volunteers.
  4. Examine local and systemic innate, humoral and cellular responses to PCV-13 with and without pneumococcal nasal carriage.
  5. Explore participant experience in the study to monitor acceptability.

3 Load data

Data has been downloaded from the MLW data portal after database lock on 7 October 2022 and after the randomisation list had been uploaded to the portal (i.e. after unblinding).

fileScreen<-"../Data/20221007_AfterDatabaseLock_Unblinded/marvels_PCV13_screening_raw.csv"
fileFollowUp<-"../Data/20221007_AfterDatabaseLock_Unblinded/marvels_PCV13_followup_raw.csv"
fileNasalWash<-"../Data/20221007_AfterDatabaseLock_Unblinded/marvelspcv13_nasal_wash_raw_IDSamended20221013.csv"
fileInoculumDose<-"../Data/20221007_AfterDatabaseLock_Unblinded/marvelspcv13_ino_report_raw.csv"
fileInoculum<-"../Data/20221007_AfterDatabaseLock_Unblinded/marvelspcv13_inoculum_labb_raw.csv"

fileCarriage<-"../Data/20221007_AfterDatabaseLock_Unblinded/marvels_pcv13_carriageData_2022-11-01.csv"
fileCarriageSerotype<-"../Data/20221007_AfterDatabaseLock_Unblinded/carriage_20221013.csv"

fileRandomisation<-"../Data/20221007_AfterDatabaseLock_Unblinded/marvelspcv13_unblinding_raw_full.csv"

fileWithdrawals<-"../Data/20221007_AfterDatabaseLock_Unblinded/marvelspcv13_withdraw_raw.csv"
fileEligible<-"../Data/20221007_AfterDatabaseLock_Unblinded/CONSORT_20220915.csv" # outdated

fileLytAVisitE<-"../Data/20221129_LytAPCR/20221129 - Visit E_Final_Clean_Manuscript_Tar.csv"
fileLytAVisitF<-"../Data/20221129_LytAPCR/20221129 - Visit F_Final_Clean_Manuscript_Tar.csv"
fileLytAVisitG<-"../Data/20221129_LytAPCR/20221129 - Visit G_Final_Clean_Manuscript_Tar.csv"

fileLytADensity6B<-"../Data/20230123_LytAPCRDensity/MARVELS PCV13 Experimental carriage Density Calculation.csv"

datScreen<-read.csv(fileScreen)
datFollowUp<-read.csv(fileFollowUp)
datNasalWash<-read.csv(fileNasalWash)
datInoculumDose<-read.csv(fileInoculumDose)
datInoculum<-read.csv(fileInoculum)
datWithdrawals<-read.csv(fileWithdrawals)
datEligible<-read.csv(fileEligible) # outdated

datCarriageDates<-read.csv(fileCarriage)
datCarriageDates$Carrier6B[datCarriageDates$Carrier6B==""]<-"n"
datCarriageDates[datCarriageDates==""]<-NA

datCarriageSerotype<-read.csv(fileCarriageSerotype)

# remove PIDs CHIMB1720 and CHIMB2538 as these tested Covid-19 positive at visit B but should be part of analysis population since Covid-19 negative at visits C and later
datWithdrawals<-datWithdrawals %>% dplyr::filter(!(pid %in% c("CHIMB1720","CHIMB2538")))

pidsComplete<-datCarriageDates$pid[!is.na(datCarriageDates$Carriage.E) & !is.na(datCarriageDates$Carriage.F) & !is.na(datCarriageDates$Carriage.G) & datCarriageDates$Covid19Positive==0 & !(datCarriageDates$pid %in% datWithdrawals$pid)]
datCarriageAll<-datCarriageDates
datCarriageDates<-datCarriageDates[datCarriageDates$pid %in% pidsComplete,]

pids20<-datFollowUp$pid[grepl(pattern="20000",datFollowUp$vial_verified)]
pids80<-datFollowUp$pid[grepl(pattern="80000",datFollowUp$vial_verified)]
pids160<-datFollowUp$pid[grepl(pattern="160000",datFollowUp$vial_verified)]
pidsCovid<-datCarriageAll$pid[datCarriageAll$Covid19Positive==1]
pidsWithdrawn<-datWithdrawals$pid

datCarriageSerotype<-datCarriageSerotype %>%
  dplyr::mutate(
    carriage=case_when(
      Carriage=="6B"~"6B",
      Carriage=="NVT"~"Natural",
      TRUE~NA_character_
    ),
    carriageWithVT=case_when(
      SEROTYPE=="6B"~"6B",
      SEROTYPE=="NVT"~"Natural NVT",
      SEROTYPE %in% setdiff(unique(datCarriageSerotype$SEROTYPE),c("6B","NVT"))~"Natural VT",
      TRUE~NA_character_
    ),
    serotype=SEROTYPE,
    dose=paste(sep="","CFU ",gsub(pattern=",",replacement="",gsub(pattern=" CFU",replacement="",CFU)))
  ) %>%
  dplyr::select(pid,dose,Visit,carriage,carriageWithVT,serotype)

missingPids<-setdiff(pidsComplete,datCarriageSerotype$pid)
missingCarriage<-data.frame(pid=missingPids) %>%
  dplyr::mutate(
    dose=case_when(
      pid %in% pids20 ~ "CFU 20000",
      pid %in% pids80 ~ "CFU 80000",
      pid %in% pids160 ~ "CFU 160000",
      TRUE~NA_character_
    ),
    Visit="B",
    carriage=NA,
    carriageWithVT=NA,
    serotype=NA
)

datCarriageSerotype<-rbind(datCarriageSerotype,missingCarriage)
  
datCarriage<-datCarriageSerotype %>%
  dplyr::filter(pid %in% pidsComplete) %>%
  tidyr::pivot_wider(id_cols=c(pid,dose),names_from=Visit,values_from=c(carriage,serotype)) %>%
  dplyr::mutate(
    carriageVisitB=ifelse(!is.na(carriage_B) & carriage_B=="6B",1,0),
    carriageVisitC=ifelse(!is.na(carriage_C) & carriage_C=="6B",1,0),
    carriageVisitE=ifelse(!is.na(carriage_E) & carriage_E=="6B",1,0),
    carriageVisitF=ifelse(!is.na(carriage_F) & carriage_F=="6B",1,0),
    carriageVisitG=ifelse(!is.na(carriage_G) & carriage_G=="6B",1,0),
    carriageNot6BVisitB=ifelse(!is.na(carriage_B) & carriage_B=="Natural",1,0),
    carriageNot6BVisitC=ifelse(!is.na(carriage_C) & carriage_C=="Natural",1,0),
    carriageNot6BVisitE=ifelse(!is.na(carriage_E) & carriage_E=="Natural",1,0),
    carriageNot6BVisitF=ifelse(!is.na(carriage_F) & carriage_F=="Natural",1,0),
    carriageNot6BVisitG=ifelse(!is.na(carriage_G) & carriage_G=="Natural",1,0)
  ) %>%
  dplyr::mutate(
    Carrier6B=ifelse(carriageVisitB==1 |carriageVisitC==1 | carriageVisitE==1 | carriageVisitF==1 | carriageVisitG==1,1,0),
    Covid19Positive=ifelse(pid %in% datCarriageDates$pid[datCarriageDates$Covid19Positive==1],1,0)
  )

datRandomisation<-read.csv(fileRandomisation)
Number of nasal wash samples acquired per visit as of 07 October 2022.

Figure 3.1: Number of nasal wash samples acquired per visit as of 07 October 2022.

CFU concentrations over the study period.

Figure 3.2: CFU concentrations over the study period.

datInoculum %>%
  mutate(
    date=date(dmy_hms(start)),
    dose=factor(levels=c("CFU 20000","CFU 80000","CFU 160000"),paste(sep="","CFU ",gsub(pattern="6B-[A-Z][0-9]+-",replacement="",vial)))
    ) %>%
  pivot_longer(cols=contains("cfu"),names_to="type",values_to="cfu") %>%
  mutate(type=gsub(pattern="_",replacement="",gsub(type,pattern="cfu",replacement=""))) %>%
  dplyr::select(date,dose,cfu,type) %>%
  dplyr::filter(type=="average") %>%
  dplyr::select(!type) %>%
  dplyr::group_by(dose) %>%
  dplyr::summarise(
    meanCFU=round(mean(cfu)),
    sdCFU=round(sd(cfu)),
    medianCFU=round(median(cfu)),
    iqrCFU=paste(sep="","(",paste(collapse=",",round(quantile(cfu,probs=c(0.25,0.75)))),")")
  ) %>%
  knitr::kable(col.names=c("Target concentration","Mean","Std. deviation","Median","IQR"),caption="Mean, standard deviation, median and inter-quartile range for the inoculum concentration (average of the before and after inoculation measurements).") %>%
  kableExtra::kable_styling(full_width = FALSE)
Table 3.1: Mean, standard deviation, median and inter-quartile range for the inoculum concentration (average of the before and after inoculation measurements).
Target concentration Mean Std. deviation Median IQR
CFU 20000 19250 2864 19250 (17750,21333)
CFU 80000 74500 13890 74167 (63333,79583)
CFU 160000 154138 32574 155000 (133334,165000)

3.1 Current pneumococcal 6B carriage in MARVELS PCV13

We want to establish how many individuals have been vaccinated, inoculated and have completed all study visits (up to visit G). Of these we want to know how many are 6B carriers.

pids20Comp<-intersect(pids20,pidsComplete)
pids80Comp<-intersect(pids80,pidsComplete)
pids160Comp<-intersect(pids160,pidsComplete)

tabSummary<-data.frame(
  CFU20000=c(length(pids20),sum(!is.na(datCarriageDates$Carriage.G[datCarriageDates$pid %in% pids20Comp])),sum((datCarriageDates$pid %in% pids20Comp) & !is.na(datCarriageDates$Carriage.G) & datCarriageDates$Carrier6B=="y")),
  CFU80000=c(length(pids80),sum(!is.na(datCarriageDates$Carriage.G[datCarriageDates$pid %in% pids80Comp])),sum((datCarriageDates$pid %in% pids80Comp) & !is.na(datCarriageDates$Carriage.G) & datCarriageDates$Carrier6B=="y")),
  CFU160000=c(length(pids160),sum(!is.na(datCarriageDates$Carriage.G[datCarriageDates$pid %in% pids160Comp])),sum((datCarriageDates$pid %in% pids160Comp) & !is.na(datCarriageDates$Carriage.G) & datCarriageDates$Carrier6B=="y"))
)

rownames(tabSummary)<-c("Inoculated","Visit G completed (and Covid-19 negative)","6B carrier")

tabSummary %>%
  knitr::kable(col.names=c("CFU 20,000","CFU 80,000","CFU 160,000")) %>%
  kableExtra::kable_styling(full_width = FALSE)
CFU 20,000 CFU 80,000 CFU 160,000
Inoculated 42 81 98
Visit G completed (and Covid-19 negative) 40 74 90
6B carrier 2 18 20

Out of a total of 40 participants from the CFU 20,000 arm that completed visit G, 2 are serotype 6B carriers; this corresponds to a carriage rate of 5%.

Out of a total of 74 participants from the CFU 80,000 arm that completed visit G, 18 are serotype 6B carriers; this corresponds to a carriage rate of 24%.

Out of a total of 90 participants from the CFU 160,000 arm that completed visit G, 20 are serotype 6B carriers; this corresponds to a carriage rate of 22%.

datDensityVisit<-datFollowUp %>%
  dplyr::select(c(pid,visit,wash)) %>%
  dplyr::mutate(visit=paste(sep="","Visit",visit)) %>%
  dplyr::mutate(density=datNasalWash$density[match(wash,datNasalWash$wash_id)]) %>%
  tidyr::pivot_wider(id_cols=c(pid),names_from=c(visit),values_from=c(wash,density),names_sep="") %>%
  dplyr::select(!c(washVisitD,densityVisitD)) %>%
  dplyr::mutate(
    densityVisitC=ifelse(densityVisitC<0,0,densityVisitC),
    densityVisitE=ifelse(densityVisitE<0,0,densityVisitE),
    densityVisitF=ifelse(densityVisitF<0,0,densityVisitF),
    densityVisitG=ifelse(densityVisitG<0,0,densityVisitG)
  )

# Setting up the required data frame
simDat<-data.frame(
  pid=c(pids20,pids80,pids160),
  doseGroup=c(rep("CFU 20000",length(pids20)),rep("CFU 80000",length(pids80)),rep("CFU 160000",length(pids160)))
) %>%
  dplyr::mutate(
    dose=as.integer(gsub(pattern="CFU ",replacement="",doseGroup)),
    sex=ifelse(datScreen$gender[match(pid,datScreen$pid)]==1,"Male","Female"),
    age=datScreen$age[match(pid,datScreen$pid)],
    covid19=ifelse(is.element(pid,pidsCovid),1,0),
    VaccName=factor(datRandomisation$vaccname[match(pid,datRandomisation$pid)],levels=c("Saline","PCV-13")),
    carriage=ifelse(pid %in% datCarriage$pid[datCarriage$Carrier6B==1],1,0),
    visitB_date=dmy(datScreen$data_date[match(pid,datScreen$pid)]),
    visitC_date=dmy(datFollowUp$data_date[match(paste(sep="_",pid,"C"),paste(sep="_",datFollowUp$pid,datFollowUp$visit))]),
    visitD_date=dmy(datFollowUp$data_date[match(paste(sep="_",pid,"D"),paste(sep="_",datFollowUp$pid,datFollowUp$visit))]),
    visitE_date=dmy(datFollowUp$data_date[match(paste(sep="_",pid,"E"),paste(sep="_",datFollowUp$pid,datFollowUp$visit))]),
    visitF_date=dmy(datFollowUp$data_date[match(paste(sep="_",pid,"F"),paste(sep="_",datFollowUp$pid,datFollowUp$visit))]),
    visitG_date=dmy(datFollowUp$data_date[match(paste(sep="_",pid,"G"),paste(sep="_",datFollowUp$pid,datFollowUp$visit))]),
    carriageVisitB=datCarriage$carriageVisitB[match(pid,datCarriage$pid)],
    carriageVisitC=datCarriage$carriageVisitC[match(pid,datCarriage$pid)],
    carriageVisitE=datCarriage$carriageVisitE[match(pid,datCarriage$pid)],
    carriageVisitF=datCarriage$carriageVisitF[match(pid,datCarriage$pid)],
    carriageVisitG=datCarriage$carriageVisitG[match(pid,datCarriage$pid)],
    carriageNot6BVisitB=datCarriage$carriageNot6BVisitB[match(pid,datCarriage$pid)],
    carriageNot6BVisitC=datCarriage$carriageNot6BVisitC[match(pid,datCarriage$pid)],
    carriageNot6BVisitE=datCarriage$carriageNot6BVisitE[match(pid,datCarriage$pid)],
    carriageNot6BVisitF=datCarriage$carriageNot6BVisitF[match(pid,datCarriage$pid)],
    carriageNot6BVisitG=datCarriage$carriageNot6BVisitG[match(pid,datCarriage$pid)],
    densityVisitC=datDensityVisit$densityVisitC[match(pid,datDensityVisit$pid)],
    densityVisitE=datDensityVisit$densityVisitE[match(pid,datDensityVisit$pid)],
    densityVisitF=datDensityVisit$densityVisitF[match(pid,datDensityVisit$pid)],
    densityVisitG=datDensityVisit$densityVisitG[match(pid,datDensityVisit$pid)]) %>%
  dplyr::mutate(
    carriageNot6B=ifelse(carriageNot6BVisitB==1 | carriageNot6BVisitC==1 | carriageNot6BVisitE==1 | carriageNot6BVisitF==1 | carriageNot6BVisitG==1,1,0),
    density6BVisitC=0,
    density6BVisitE=ifelse(carriageVisitE==1,densityVisitE,0),
    density6BVisitF=ifelse(carriageVisitF==1,densityVisitF,0),
    density6BVisitG=ifelse(carriageVisitG==1,densityVisitG,0),
    densityNot6BVisitC=ifelse(carriageNot6BVisitC==1,densityVisitC,0),
    densityNot6BVisitE=ifelse(carriageNot6BVisitE==1,densityVisitE,0),
    densityNot6BVisitF=ifelse(carriageNot6BVisitF==1,densityVisitF,0),
    densityNot6BVisitG=ifelse(carriageNot6BVisitG==1,densityVisitG,0),
    carriageSerotypeVisitB=case_when(carriageNot6BVisitB==1~"Natural",TRUE~NA_character_),
    carriageSerotypeVisitC=case_when(carriageNot6BVisitC==1~"Natural",TRUE~NA_character_),
    carriageSerotypeVisitE=case_when(carriageNot6BVisitE==0 & carriageVisitE==1~"6B",carriageNot6BVisitE==1 & carriageVisitE==0~"Natural",carriageNot6BVisitE==1 & carriageVisitE==1~"Both",TRUE~NA_character_),
    carriageSerotypeVisitF=case_when(carriageNot6BVisitF==0 & carriageVisitF==1~"6B",carriageNot6BVisitF==1 & carriageVisitF==0~"Natural",carriageNot6BVisitF==1 & carriageVisitF==1~"Both",TRUE~NA_character_),
    carriageSerotypeVisitG=case_when(carriageNot6BVisitG==0 & carriageVisitG==1~"6B",carriageNot6BVisitG==1 & carriageVisitG==0~"Natural",carriageNot6BVisitG==1 & carriageVisitG==1~"Both",TRUE~NA_character_)
  ) %>%
  dplyr::mutate(
    BMI=(datScreen$weight/(datScreen$height^2))[match(pid,datScreen$pid)],
    smoking=case_when(
      datScreen$smoking==1~"never smoker",
      datScreen$smoking %in% c(4,5,6) ~ "past smoker",
      datScreen$smoking %in% c(2,3) ~ "current smoker"
    )[match(pid,datScreen$pid)],
    vaccInoDose=factor(paste(sep=" ",VaccName,format(dose,big.mark=",",trim=TRUE)),levels=c("Saline 20,000","Saline 80,000","Saline 160,000","PCV-13 20,000","PCV-13 80,000","PCV-13 160,000"))
  )

simDat<-simDat %>%
  dplyr::mutate(
    isComplete=ifelse(pid %in% pidsComplete,1,0),
    gotCovid=ifelse(pid %in% datCarriageAll$pid[datCarriageAll$Covid19Positive==1],1,0)
  )

simDatAll<-simDat

simDat<-simDat %>%
  dplyr::filter(isComplete==1) %>%
  dplyr::filter(gotCovid==0)

4 Analysis

4.1 CONSORT diagram and participant characteristics

The very first analysis we will do will be descriptive: we will summarise the screening, recruitment, randomisation to vaccination arms, complete follow-up numbers using a standard CONSORT flow diagram as illustrated on Figure 4.1.

datVial<-datFollowUp %>%
  dplyr::filter(visit=="D") %>%
  dplyr::select(c(pid,data_date,vial_verified))

dateSwitch80<-min(ymd(simDat$visitB_date[simDat$dose==80000]))
dateSwitch160<-min(ymd(simDat$visitB_date[simDat$dose==160000]))
dateSwitch80<-max(dmy(datScreen$data_date[grepl(pattern="-020-",datScreen$rid_verified)]))+1
dateSwitch160<-min(dmy(datScreen$data_date[grepl(pattern="-160-",datScreen$rid_verified)]))

idx20<-which(dmy(datScreen$data_date)<dateSwitch80)
idx80<-which(dmy(datScreen$data_date)>=dateSwitch80 & dmy(datScreen$data_date)<dateSwitch160)
idx160<-which(dmy(datScreen$data_date)>=dateSwitch160)

datScreen<-datScreen %>%
  dplyr::mutate(
    dose=simDatAll$dose[match(pid,simDatAll$pid)],
    # dose=case_when(
    #   dmy(datScreen$data_date)<dateSwitch80~"CFU 20000",
    #   dmy(datScreen$data_date)>=dateSwitch80 & dmy(datScreen$data_date)<dateSwitch160~"CFU 80000",
    #   dmy(datScreen$data_date)>=dateSwitch160~"CFU 160000",
    #   TRUE~NA_character_
    # ),
    randomised=ifelse(!is.na(rid_verified) & rid_verified!="",1,0),
    studyArm=simDatAll$VaccName[match(pid,simDatAll$pid)],
    readEnglishOrChichewa=ifelse((!is.na(english) & english==1) | (!is.na(chichewa) & chichewa==1),1,0),
    ageUnknownorBelow18orAbove40=ifelse(is.na(age) | age<18 | age>40,1,0),
    closeContactVulnerable=ifelse((!is.na(under5) & under5==1) | (!is.na(over65) & over65==1),1,0)
  )

table(datScreen$randomised,datScreen$vac_done,useNA="ifany") # all good; 278 randomised & vaccinated

table(datScreen$studyArm,useNA="ifany")

table(datScreen$dose,useNA="ifany")

table(datScreen$dose,datScreen$randomised,useNA="ifany")

table(simDatAll$VaccName,simDatAll$dose)

pidsWithdrawn<-setdiff(simDatAll$pid,simDat$pid)

datCarriageAll<-datCarriageAll %>%
  dplyr::mutate(
    VaccName=datRandomisation$vaccname[match(pid,datRandomisation$pid)],
    dose=simDatAll$dose[match(pid,simDatAll$pid)],
    CompletesVisitE=ifelse(!is.na(Carriage.E) & Carriage.E!="",1,0),
    CompletesVisitF=ifelse(!is.na(Carriage.E) & Carriage.E!="" & !is.na(Carriage.F) & Carriage.F!="",1,0),
    CompletesVisitG=ifelse(!is.na(Carriage.E) & Carriage.E!="" & !is.na(Carriage.F) & Carriage.F!="" & !is.na(Carriage.G) & Carriage.G!="",1,0),
    CompletesVisitGwithoutCovid=ifelse(!is.na(Carriage.E) & Carriage.E!="" & !is.na(Carriage.F) & Carriage.F!="" & !is.na(Carriage.G) & Carriage.G!="" & Covid19Positive==0,1,0)
  )

table(datCarriageAll$dose,datCarriageAll$VaccName,datCarriageAll$CompletesVisitE,useNA="ifany")
table(datCarriageAll$dose,datCarriageAll$VaccName,datCarriageAll$CompletesVisitF,useNA="ifany")
table(datCarriageAll$dose,datCarriageAll$VaccName,datCarriageAll$CompletesVisitG,useNA="ifany")

pidWithdrew_pcv13_20_visitG<-unlist(datCarriageAll %>% dplyr::filter(datCarriageAll$VaccName=="PCV-13" & datCarriageAll$dose==20000 & datCarriageAll$CompletesVisitG==0) %>% dplyr::filter(!is.na(pid)) %>% dplyr::select(pid))
pidWithdrew_saline_20_visitG<-unlist(datCarriageAll %>% dplyr::filter(datCarriageAll$VaccName=="Saline" & datCarriageAll$dose==20000 & datCarriageAll$CompletesVisitG==0) %>% dplyr::filter(!is.na(pid)) %>% dplyr::select(pid))
pidWithdrew_pcv13_80_visitG<-unlist(datCarriageAll %>% dplyr::filter(datCarriageAll$VaccName=="PCV-13" & datCarriageAll$dose==80000 & datCarriageAll$CompletesVisitG==0) %>% dplyr::filter(!is.na(pid)) %>% dplyr::select(pid))
pidWithdrew_saline_80_visitG<-unlist(datCarriageAll %>% dplyr::filter(datCarriageAll$VaccName=="Saline" & datCarriageAll$dose==80000 & datCarriageAll$CompletesVisitG==0) %>% dplyr::filter(!is.na(pid)) %>% dplyr::select(pid))
pidWithdrew_pcv13_160_visitG<-unlist(datCarriageAll %>% dplyr::filter(datCarriageAll$VaccName=="PCV-13" & datCarriageAll$dose==160000 & datCarriageAll$CompletesVisitG==0) %>% dplyr::filter(!is.na(pid)) %>% dplyr::select(pid))
pidWithdrew_saline_160_visitG<-unlist(datCarriageAll %>% dplyr::filter(datCarriageAll$VaccName=="Saline" & datCarriageAll$dose==160000 & datCarriageAll$CompletesVisitG==0) %>% dplyr::filter(!is.na(pid)) %>% dplyr::select(pid))


table(simDat$VaccName,simDat$dose)

table(datScreen$randomised,datScreen$w_consent,datScreen$dose,useNA="ifany")
table(datScreen$randomised,datScreen$ageUnknownorBelow18orAbove40,datScreen$dose,useNA="ifany")
table(datScreen$randomised,datScreen$readEnglishOrChichewa,datScreen$dose,useNA="ifany")
table(datScreen$randomised,datScreen$hearingissuephone,datScreen$dose,useNA="ifany")
table(datScreen$randomised,datScreen$location,datScreen$dose,useNA="ifany")
table(datScreen$randomised,datScreen$hiv,datScreen$dose,useNA="ifany")
table(datScreen$randomised,datScreen$pcv,datScreen$dose,useNA="ifany")
table(datScreen$randomised,datScreen$closeContactVulnerable,datScreen$dose,useNA="ifany")

nrow(datScreen)
table(datScreen$dos)
include_graphics("../Data/20221007_AfterDatabaseLock_Unblinded/CONSORT/MARVELS_PCV13_ConsortDiagram_withTotals_Phase1Only_newFormat_20221101.png")
CONSORT diagram.

Figure 4.1: CONSORT diagram.

tab<-table(simDat$VaccName,simDat$carriage,simDat$dose)
tabFull<-as.data.frame(rbind(cbind(20000,tab[,,1]),cbind(80000,tab[,,2]),cbind(160000,tab[,,3])))

tabFull<-tabFull %>%
  mutate(
    arm=case_when(
      grepl(row.names(tabFull),pattern="PCV")~"PCV-13",
      grepl(row.names(tabFull),pattern="Saline")~"Saline"
    )
  )

colnames(tabFull)<-c("dose","noCarriage","carriage","arm")
tabFull<-tabFull[,c("arm","dose","noCarriage","carriage")]

We also list participant characteristics (sex, age, BMI) and screening visit measurements (percentage of participant with current natural carriage by S. pneumoniae other than serotype 6B, full blood count results, cytokine levels) by study arm and by inoculation dose (Table 4.1).

sumFun<-function(.data){
  .data %>%
    dplyr::summarise(n=n(),
                     sexMale=sum(na.rm=T,sex=="Male"),
                     sexMaleProp=sum(na.rm=T,sex=="Male")/sum(!is.na(sex)),
                     ageMedian=median(na.rm=T,age),
                     age25th=quantile(na.rm=T,age,probs=0.25),
                     age75th=quantile(na.rm=T,age,probs=0.75),
                     BMIMedian=median(na.rm=T,BMI),
                     BMI25th=quantile(na.rm=T,BMI,probs=0.25),
                     BMI75th=quantile(na.rm=T,BMI,probs=0.75),
                     smokingNever=sum(na.rm=T,smoking=="never smoker"),
                     smokingNeverProp=sum(na.rm=T,smoking=="never smoker")/sum(!is.na(smoking)),
                     smokingPast=sum(na.rm=T,smoking=="past smoker"),
                     smokingPastProp=sum(na.rm=T,smoking=="past smoker")/sum(!is.na(smoking)),
                     smokingCurrent=sum(na.rm=T,smoking=="current smoker"),
                     smokingCurrentProp=sum(na.rm=T,smoking=="current smoker")/sum(!is.na(smoking)),
                     .groups="drop"
    )
}

reformatFun<-function(dat,var,type,rowIdx,roundDigits=NULL){
  # type needs to be one of "medianIQR", "meanSD", "nProp"
  if(type=="medianIQR"){
    if(is.null(roundDigits)){roundDigits=1}
    tmp<-as.numeric(dat[rowIdx,paste(sep="",var,c("Median","25th","75th"))])
    tmp<-c(format(nsmall=roundDigits,round(tmp[1],digits=roundDigits)),paste(sep="","(",format(nsmall=roundDigits,round(tmp[2],digits=roundDigits)),",",format(nsmall=roundDigits,round(tmp[3],digits=roundDigits)),")"))
  }else if(type=="meanSD"){
    if(is.null(roundDigits)){roundDigits=1}
    tmp<-as.numeric(dat[rowIdx,paste(sep="",var,c("Mean","Sd"))])
    tmp<-c(format(nsmall=roundDigits,round(tmp[1],digits=roundDigits)),paste(sep="","(",format(nsmall=roundDigits,round(tmp[2],digits=roundDigits)),")"))
  }else if(type=="nProp"){
    if(is.null(roundDigits)){roundDigits=1}
    tmp<-as.numeric(dat[rowIdx,paste(sep="",var,c("","Prop"))])
    tmp<-c(tmp[1],paste(sep="","(",format(nsmall=roundDigits,round(100*tmp[2],digits=roundDigits)),"%)"))
  }
  
  return(tmp)
}

demoDatAll<-simDat %>%
  sumFun()

demoDatByVacc<-simDat %>%
  group_by(VaccName) %>%
  sumFun()

demoDatKable<-data.frame(
  variable=c("Participants","SARS-CoV-2 infection","Complete follow-up data^§^","Sex: male","Age","BMI (kg/m^2^)","Smoking status: never smoker","Smoking status: past smoker","Smoking status: current smoker"),
  statistics=c("n (%)","n (%)","n (%)","n (%)","median (IQR)","median (IQR)","n (%)","n (%)","n (%)"),
  Astat1=NA,
  Astat2=NA,
  Vstat1=NA,
  Vstat2=NA,
  Cstat1=NA,
  Cstat2=NA
)

demoDatKable[1,-c(1:2)]<-c(nrow(simDatAll),
                           paste(sep="","(",format(nsmall=1,round(100*nrow(simDatAll)/nrow(simDatAll),digits=1)),"%)"),
                           sum(simDatAll$VaccName=="PCV-13"),
                           paste(sep="","(",format(nsmall=1,round(100*sum(simDatAll$VaccName=="PCV-13")/nrow(simDatAll),digits=1)),"%)"),
                           sum(simDatAll$VaccName=="Saline"),
                           paste(sep="","(",format(nsmall=1,round(100*sum(simDatAll$VaccName=="Saline")/nrow(simDatAll),digits=1)),"%)"))

demoDatKable[demoDatKable$variable=="SARS-CoV-2 infection",-c(1:2)]<-c(sum(simDatAll$covid19==1),
                           paste(sep="","(",format(nsmall=1,round(100*sum(simDatAll$covid19==1)/nrow(simDatAll),digits=1)),"%)"),
                           sum(simDatAll$VaccName=="PCV-13" & simDatAll$covid19==1),
                           paste(sep="","(",format(nsmall=1,round(100*sum(simDatAll$VaccName=="PCV-13" & simDatAll$covid19==1)/sum(simDatAll$VaccName=="PCV-13"),digits=1)),"%)"),
                           sum(simDatAll$VaccName=="Saline" & simDatAll$covid19==1),
                           paste(sep="","(",format(nsmall=1,round(100*sum(simDatAll$VaccName=="Saline" & simDatAll$covid19==1)/sum(simDatAll$VaccName=="Saline"),digits=1)),"%)"))

demoDatKable[demoDatKable$variable=="Complete follow-up data^§^",-c(1:2)]<-c(sum(simDatAll$isComplete==1),
  paste(sep="","(",format(nsmall=1,round(100*sum(simDatAll$isComplete==1)/nrow(simDatAll),digits=1)),"%)"),
  sum(simDatAll$VaccName=="PCV-13" & simDatAll$isComplete==1),
  paste(sep="","(",format(nsmall=1,round(100*sum(simDatAll$VaccName=="PCV-13" & simDatAll$isComplete==1)/sum(simDatAll$VaccName=="PCV-13"),digits=1)),"%)"),
  sum(simDatAll$VaccName=="Saline" & simDatAll$isComplete==1),
  paste(sep="","(",format(nsmall=1,round(100*sum(simDatAll$VaccName=="Saline" & simDatAll$isComplete==1)/sum(simDatAll$VaccName=="Saline"),digits=1)),"%)"))

demoDatKable[demoDatKable$variable=="Sex: male",-c(1:2)]<-c(reformatFun(dat=demoDatAll,var="sexMale",type="nProp",rowIdx=1),reformatFun(dat=demoDatByVacc,var="sexMale",type="nProp",rowIdx=2),reformatFun(dat=demoDatByVacc,var="sexMale",type="nProp",rowIdx=1))
demoDatKable[demoDatKable$variable=="Age",-c(1:2)]<-c(reformatFun(dat=demoDatAll,var="age",type="medianIQR",rowIdx=1),reformatFun(dat=demoDatByVacc,var="age",type="medianIQR",rowIdx=2),reformatFun(dat=demoDatByVacc,var="age",type="medianIQR",rowIdx=1))
demoDatKable[demoDatKable$variable=="BMI (kg/m^2^)",-c(1:2)]<-c(reformatFun(dat=demoDatAll,var="BMI",type="medianIQR",rowIdx=1),reformatFun(dat=demoDatByVacc,var="BMI",type="medianIQR",rowIdx=2),reformatFun(dat=demoDatByVacc,var="BMI",type="medianIQR",rowIdx=1))
demoDatKable[demoDatKable$variable=="Smoking status: never smoker",-c(1:2)]<-c(reformatFun(dat=demoDatAll,var="smokingNever",type="nProp",rowIdx=1),reformatFun(dat=demoDatByVacc,var="smokingNever",type="nProp",rowIdx=2),reformatFun(dat=demoDatByVacc,var="smokingNever",type="nProp",rowIdx=1))
demoDatKable[demoDatKable$variable=="Smoking status: past smoker",-c(1:2)]<-c(reformatFun(dat=demoDatAll,var="smokingPast",type="nProp",rowIdx=1),reformatFun(dat=demoDatByVacc,var="smokingPast",type="nProp",rowIdx=2),reformatFun(dat=demoDatByVacc,var="smokingPast",type="nProp",rowIdx=1))
demoDatKable[demoDatKable$variable=="Smoking status: current smoker",-c(1:2)]<-c(reformatFun(dat=demoDatAll,var="smokingCurrent",type="nProp",rowIdx=1),reformatFun(dat=demoDatByVacc,var="smokingCurrent",type="nProp",rowIdx=2),reformatFun(dat=demoDatByVacc,var="smokingCurrent",type="nProp",rowIdx=1))

demoDatKable %>%
  kable(col.names=rep("",ncol(demoDatKable)),caption="Participant characteristics (at screening).") %>%
  kable_styling(full_width=F) %>%
  add_header_above(c(" "," ","All"=2,"PCV-13"=2,"Saline"=2)) %>%
  pack_rows("Study characteristics",1,3) %>%
  pack_rows("Demographics",4,5) %>%
  pack_rows("Physiology",6,6) %>%
  pack_rows("Life style",7,9) %>%
  footnote(symbol="The totals from this row are used as the column-wide denominators to calculate percentages in rows further down in the table.",symbol_manual = c("§"))
Table 4.1: Participant characteristics (at screening).
All
PCV-13
Saline
Study characteristics
Participants n (%) 221 (100.0%) 107 (48.4%) 114 (51.6%)
SARS-CoV-2 infection n (%) 14 (6.3%) 8 (7.5%) 6 (5.3%)
Complete follow-up data§ n (%) 204 (92.3%) 98 (91.6%) 106 (93.0%)
Demographics
Sex: male n (%) 147 (72.1%) 75 (76.5%) 72 (67.9%)
Age median (IQR) 25.3 (22.9,28.5) 25.5 (23.1,28.0) 24.6 (22.8,28.5)
Physiology
BMI (kg/m2) median (IQR) 21.8 (20.2,23.9) 21.5 (20.2,24.5) 22.0 (20.3,23.7)
Life style
Smoking status: never smoker n (%) 199 (97.5%) 95 (96.9%) 104 (98.1%)
Smoking status: past smoker n (%) 4 (2.0%) 2 (2.0%) 2 (1.9%)
Smoking status: current smoker n (%) 1 (0.5%) 1 (1.0%) 0 (0.0%)
§ The totals from this row are used as the column-wide denominators to calculate percentages in rows further down in the table.
write.csv(demoDatKable,row.names=F,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_ManuscriptTable1_ParticipantCharacteristics.csv"))

4.2 Phase 1

4.2.1 Primary analysis

The main objective analysis compares the proportion of pneumococcal carriage between the PCV13 and the saline arms. The main analysis uses data from all participants, whether inoculated with the 20,000, the 80,000 or the 160,000 CFU dose. This comparison has been done using a log-binomial regression model, with predictor variables for inoculation dose group (the CFU 160,000 group will be used as reference as the largest group) and vaccine arm:

\[ \log(\pi) = \log\left(E[Y|X_{20,000},X_{80,000},X_{PCV13}]\right) = \beta_0 + \beta_{20,000}\cdot X_{20,000} + \beta_{80,000}\cdot X_{80,000} + \beta_{PCV13}\cdot X_{PCV13} \] where \(\pi\) is the probability of being a carrier, \(Y\) is a random variable indicating serotype 6B carriage status, \(X_{20,000}\) is an indicator variable for whether or not a participant was inoculated at CFU 20,000, similarly for \(X_{80,000}\), and \(X_{PCV13}\) is an indicator variable for vaccination (with PCV13 vaccine) status. \(\beta_0, \beta_{20,000}, \beta_{80,000}, \beta_{PCV13}\) are the regression coefficients that have been estimated. \(Y|X_{20,000},X_{80,000},X_{PCV13}\) is assumed to follow a binomial distribution with parameter \(\pi\).

For our main analysis of the effectiveness of the PCV13 vaccine, we will test the null hypothesis \(H_0\) of no effect of the PCV13 vaccine against the alternative hypothesis \(H_1\) that it has an effect on the carriage probability:

\[ H_0:\qquad\beta_{PCV13} = 0 \\ H_1:\qquad\beta_{PCV13} \neq 0 \]

We report the estimated risk ratio from the log-binomial regression model \(exp(\hat{\beta}_{PCV13})\) together with its 95% confidence interval (see Table 4.3).

Below we display a set of 2x2 tables (one for each inoculation dose and one for the data from all doses combined), stacked on top of each other, summarising the carriage results. While we report the p-values from exact Fisher tests for each 2x2 table, the primary analysis and conclusion are based on the p-value from the log-binomial regression as detailed above (see Table 4.3).

simDat<-simDat %>%
  mutate(doseGroup=factor(paste(sep=" ","CFU",dose),levels=c("CFU 160000","CFU 80000","CFU 20000")))

mod<-logbin::logbin(carriage~VaccName+doseGroup,data=simDat)

simDat20<-simDat %>% dplyr::filter(dose==20000)
simDat80<-simDat %>% dplyr::filter(dose==80000)
simDat160<-simDat %>% dplyr::filter(dose==160000)

tab20<-table(simDat20$VaccName,simDat20$carriage)
tab80<-table(simDat80$VaccName,simDat80$carriage)
tab160<-table(simDat160$VaccName,simDat160$carriage)
tabAll<-table(simDat$VaccName,simDat$carriage)
tabFull<-rbind(tab20,tab80,tab160,tabAll)

pFish20<-fisher.test(tab20)$p.value
pFish20<-ifelse(pFish20>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFish20)))," < 0.001")
pFish80<-fisher.test(tab80)$p.value
pFish80<-ifelse(pFish80>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFish80)))," < 0.001")
pFish160<-fisher.test(tab160)$p.value
pFish160<-ifelse(pFish160>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFish160)))," < 0.001")
pFishAll<-fisher.test(tabAll)$p.value
pFishAll<-ifelse(pFishAll>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFishAll)))," < 0.001")

pGlm<-summary(mod)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
rrGlmVacc<-exp(c(summary(mod)$coefficients["VaccNamePCV-13","Estimate"],confint(mod)["VaccNamePCV-13",]))

tabFull %>%
  kable(caption=paste(sep="","Number of participants colonised with experimental S. Pneumoniae serotype 6B by study arm and inoculation dose. According to a log-binomial regression model which also adjusts for inoculation dose, there is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),")."),col.names=c("Not colonised","Colonised"),row.names=T) %>%
  kable_styling(full_width = F) %>%
  pack_rows(paste(sep="","CFU 20,000 (Fisher test p",pFish20,")"),1,2) %>%
  pack_rows(paste(sep="","CFU 80,000 (Fisher test p",pFish80,")"),3,4) %>%
  pack_rows(paste(sep="","CFU 160,000 (Fisher test p",pFish160,")"),5,6) %>%
  pack_rows(paste(sep="","All doses (Fisher test p",pFishAll,")"),7,8)
Table 4.2: Number of participants colonised with experimental S. Pneumoniae serotype 6B by study arm and inoculation dose. According to a log-binomial regression model which also adjusts for inoculation dose, there is a statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR 0.38, 95% CI (0.20, 0.72), p = 0.003).
Not colonised Colonised
CFU 20,000 (Fisher test p = 0.219)
Saline 17 2
PCV-13 21 0
CFU 80,000 (Fisher test p = 0.292)
Saline 29 12
PCV-13 27 6
CFU 160,000 (Fisher test p = 0.005)
Saline 30 16
PCV-13 40 4
All doses (Fisher test p = 0.001)
Saline 76 30
PCV-13 88 10
tabFullAugmented<-data.frame(
  DoseGroup=c(rep("CFU 20,000",2),rep("CFU 80,000",2),rep("CFU 160,000",2),rep("All",2)),
  StudyArm=rownames(tabFull),
  carriageNegative=tabFull[,1],
  carriageNegative=tabFull[,2],
  p.value=c(pFish20,NA,pFish80,NA,pFish160,NA,pFishAll,NA)
)

write.csv(tabFullAugmented,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_carriageByStudyArmInoculationDose.csv"))

We also summarise the log-binomial model fit, reporting the estimated coefficients, their standard errors and associated p-values.

modRes<-summary(mod)$coefficients %>%
  as.data.frame() %>%
  mutate(
    parameter=case_when(
      rownames(summary(mod)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(mod)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(mod)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
      rownames(summary(mod)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000"
    )
  )

colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")

modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  dplyr::mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
  dplyr::mutate(Estimate=exp(Estimate)) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","Summary of the log-binomial regression model fit. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),"). The 'Estimate' column in the table below shows the baseline risk on the '(Intercept)' row and the relative risk associated with each of the exposures on the subsequent rows."),digits=3) %>%
  kable_styling(full_width = FALSE)
Table 4.3: Summary of the log-binomial regression model fit. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 33%, 95% CI (22.0%, 48.4%). There is a statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR 0.38, 95% CI (0.20, 0.72), p = 0.003). The ‘Estimate’ column in the table below shows the baseline risk on the ‘(Intercept)’ row and the relative risk associated with each of the exposures on the subsequent rows.
Estimate Std. error Z statistic p-value
(Intercept) 0.327 0.201 -5.573 0.000
PCV-13 vaccine 0.376 0.333 -2.936 0.003
Dose: CFU 80,000 1.000 0.276 0.000 1.000
Dose: CFU 20,000 0.231 0.706 -2.077 0.038
tt<-modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  dplyr::mutate(Estimate=exp(Estimate))

write.csv(tt,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_ManuscriptTable3_carriageLogBinomialPrimaryAnalysisModelSummary.csv"))

We summarise the carriage data in Figure 4.2.

plotDat<-simDat %>%
  group_by(vaccInoDose,VaccName,dose) %>%
  summarise(
    n=n(),
    k=sum(carriage),
    .groups="drop") %>%
  mutate(
    prop=k/n,
    propLow=NA,
    propUpp=NA
  )

plotDatTmp<-simDat %>%
  group_by(VaccName) %>%
  summarise(
    n=n(),
    k=sum(carriage),
    .groups="drop") %>%
  mutate(
    prop=k/n,
    propLow=NA,
    propUpp=NA
  ) %>%
  mutate(
    dose="(any dose)",
    vaccInoDose=paste(sep=" ",VaccName,"(any dose)")
  )

plotDat<-rbind(plotDat,plotDatTmp) %>%
  dplyr::select(vaccInoDose,VaccName,dose,n,k,prop,propLow,propUpp)

for(j in 1:nrow(plotDat)){
  ci<-binom.test(x=plotDat$k[j],n=plotDat$n[j])$conf.int
  plotDat$propLow[j]<-ci[1]
  plotDat$propUpp[j]<-ci[2] 
}

rm(ci,plotDatTmp)

plotDat$dose<-fct_recode(factor(plotDat$dose,levels=c("20000","80000","160000","(any dose)")),"20,000 CFU"="20000","80,000 CFU"="80000","160,000 CFU"="160000","(any dose)"="(any dose)")
plotDat<-plotDat[order(plotDat$VaccName,plotDat$dose),]

plotDat<-plotDat %>%
  mutate(col=c(blueCols,orangeCols))

plotDat$vaccInoDose<-factor(plotDat$vaccInoDose,levels=c("Saline 20,000","Saline 80,000","Saline 160,000","Saline (any dose)","PCV-13 20,000","PCV-13 80,000","PCV-13 160,000","PCV-13 (any dose)"))

yMax<-max(c(plotDat$propUpp))

g1<-plotDat %>%
  filter(VaccName=="Saline") %>%
  ggplot(mapping=aes(x=dose,y=prop,ymin=propLow,ymax=propUpp,fill=dose,label=paste(sep=" / ",k,n))) +
  geom_bar(stat="identity",alpha=0.9) +
  geom_errorbar(width=0.2,col="grey") +
  geom_text(y=-0.012,col="grey50",fontface = "bold") +
  scale_fill_manual(values=c(rgb(70+40,130+40,180+40,maxColorValue=255),rgb(70,130,180,maxColorValue=255),rgb(70-40,130-40,180-40,maxColorValue=255),rgb(70-70,130-100,180-100,maxColorValue=255))) +
  xlab("Study arm & inoculation dose") +
  ylab("Carriage proportion") +
  guides(fill="none") +
  theme_light() +
  labs(title="Saline") +
  ylim(c(0,yMax)) +
  geom_vline(xintercept=3.5,col="darkgrey",lwd=1.25,lty=2) +
  theme(axis.text.x = element_text(angle = 45, hjust=1),text=element_text(size=16))

g2<-plotDat %>%
  filter(VaccName=="PCV-13") %>%
  ggplot(mapping=aes(x=dose,y=prop,ymin=propLow,ymax=propUpp,fill=dose,label=paste(sep=" / ",k,n))) +
  geom_bar(stat="identity",alpha=0.9) +
  geom_errorbar(width=0.2,col="grey") +
  geom_text(y=-0.012,col="grey50",fontface = "bold") +
  scale_fill_manual(values=c(rgb(245+10,165+40,0,maxColorValue=255),rgb(245,165,0,maxColorValue=255),rgb(245-10,165-40,0,maxColorValue=255),rgb(245-30,165-120,0,maxColorValue=255))) +
  xlab("Study arm & inoculation dose") +
  ylab("Carriage proportion") +
  guides(fill="none") +
  theme_light() +
  labs(title="PCV-13") +
  ylim(c(0,yMax)) +
  geom_vline(xintercept=3.5,col="darkgrey",lwd=1.5,lty=2) +
  theme(axis.text.x = element_text(angle = 45, hjust=1),text=element_text(size=16))

pFish20<-as.numeric(gsub(pattern=" = ",replacement="",pFish20))
pFish80<-as.numeric(gsub(pattern=" = ",replacement="",pFish80))
pFish160<-as.numeric(gsub(pattern=" = ",replacement="",pFish160))
pFishAll<-as.numeric(gsub(pattern=" = ",replacement="",pFishAll))

figCap<-paste(sep="","Carriers out of total participants are indicated below each bar.\nLog-binomial regression p-value for PCV-13 vaccination coefficient, p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),".\nFisher's exact test p-value (overall carriage) ",ifelse(pFishAll>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pFishAll))),"<0.001"),".\nFisher's exact test p-value at doses 20,000, 80,000 and 160,000 CFU = ",paste(collapse=", ",format(nsmall=3,round(digits=3,c(pFish20,pFish80,pFish160)))),".")
grid.arrange(g1,g2,nrow=1,bottom=grid::textGrob(figCap, x = unit(1, "npc"), just="right", gp=gpar(fontsize=10)))
Serotype 6B carriage proportions per study arm and inoculation dose.

Figure 4.2: Serotype 6B carriage proportions per study arm and inoculation dose.

pdf(width=12,height=8,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_carriageByStudyArmInoculationDose.pdf"))
grid.arrange(g1,g2,nrow=1,bottom=grid::textGrob(figCap, x = unit(1, "npc"), just="right", gp=gpar(fontsize=10)))
dev.off()
## quartz_off_screen 
##                 2
png(width=12,height=8,res=450,units="in",file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_carriageByStudyArmInoculationDose.png"))
grid.arrange(g1,g2,nrow=1,bottom=grid::textGrob(figCap, x = unit(1, "npc"), just="right", gp=gpar(fontsize=10)))
dev.off()
## quartz_off_screen 
##                 2
4.2.1.0.1 Experimental carriage by study arm, inoculation dose and study visit
datCarrStudyArmDoseVisit<-simDat %>%
  dplyr::select(c(pid,VaccName,doseGroup,carriageVisitE,carriageVisitF,carriageVisitG,carriage)) %>%
  dplyr::mutate(doseGroup=factor(doseGroup,levels=c("CFU 20000","CFU 80000","CFU 160000"))) %>%
  tidyr::pivot_longer(cols=c(carriageVisitE,carriageVisitF,carriageVisitG,carriage),names_to="Visit",values_to="carriage6B") %>%
  dplyr::mutate(Visit=gsub(pattern="carriageVisit",replacement="Visit ",Visit)) %>%
  dplyr::group_by(Visit,VaccName,doseGroup) %>%
  dplyr::summarise(
    k=sum(carriage6B),
    n=n(),
    prop=mean(carriage6B),
    .groups="drop"
  ) %>%
  dplyr::mutate(
    VaccName=forcats::fct_recode(VaccName,"Saline"="Saline","PCV13"="PCV-13"),
    doseGroup=forcats::fct_recode(doseGroup,"CFU 20,000"="CFU 20000","CFU 80,000"="CFU 80000","CFU 160,000"="CFU 160000"),
    propCILow=NA,
    propCIUpp=NA,
    propFormatted=NA
  )

datCarrStudyArmDoseVisit$Visit[datCarrStudyArmDoseVisit$Visit=="carriage"]<-"Any day"
datCarrStudyArmDoseVisit$Visit<-factor(datCarrStudyArmDoseVisit$Visit,levels=c("Visit E","Visit F","Visit G","Any day"))
datCarrStudyArmDoseVisit<-datCarrStudyArmDoseVisit %>%
  dplyr::arrange(Visit)

for(i in 1:nrow(datCarrStudyArmDoseVisit)){
  tmp<-binom.test(x=datCarrStudyArmDoseVisit$k[i],n=datCarrStudyArmDoseVisit$n[i])
  datCarrStudyArmDoseVisit$propCILow[i]<-tmp$conf.int[1]
  datCarrStudyArmDoseVisit$propCIUpp[i]<-tmp$conf.int[2]
  datCarrStudyArmDoseVisit$propFormatted[i]<-paste(sep="",format(nsmall=1,round(digits=1,100*datCarrStudyArmDoseVisit$prop[i])),"% (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*c(datCarrStudyArmDoseVisit$propCILow[i],datCarrStudyArmDoseVisit$propCIUpp[i])))),"%)")
}

datCarrStudyArmDoseVisitKable<-datCarrStudyArmDoseVisit %>%
  dplyr::select(Visit,VaccName,doseGroup,k,n,propFormatted) %>%
  tidyr::pivot_wider(id_cols = c(Visit,doseGroup),names_from=VaccName,values_from = c(k,n,propFormatted)) %>%
  dplyr::select(c(Visit,doseGroup,contains("Saline"),contains("PCV13")))


datCarrStudyArmDoseVisitKable %>%
  dplyr::select(-Visit) %>%
  knitr::kable(row.names=F,col.names=c("","k","n","proportion (95% CI)","k","n","proportion (95% CI)")) %>%
  kableExtra::kable_styling(full_width = FALSE) %>%
  kableExtra::pack_rows(start_row=1,end_row=3,group_label="Day 2") %>%
  kableExtra::pack_rows(start_row=4,end_row=6,group_label="Day 7") %>%
  kableExtra::pack_rows(start_row=7,end_row=9,group_label="Day 14") %>%
  kableExtra::pack_rows(start_row=10,end_row=12,group_label="Any day") %>%
  kableExtra::add_header_above(c(" "=1,"Saline"=3,"PCV-13"=3))
Saline
PCV-13
k n proportion (95% CI) k n proportion (95% CI)
Day 2
CFU 20,000 2 19 10.5% ( 1.3%, 33.1%) 0 21 0.0% ( 0.0%, 16.1%)
CFU 80,000 11 41 26.8% (14.2%, 42.9%) 4 33 12.1% ( 3.4%, 28.2%)
CFU 160,000 9 46 19.6% ( 9.4%, 33.9%) 4 44 9.1% ( 2.5%, 21.7%)
Day 7
CFU 20,000 1 19 5.3% ( 0.1%, 26.0%) 0 21 0.0% ( 0.0%, 16.1%)
CFU 80,000 8 41 19.5% ( 8.8%, 34.9%) 3 33 9.1% ( 1.9%, 24.3%)
CFU 160,000 9 46 19.6% ( 9.4%, 33.9%) 3 44 6.8% ( 1.4%, 18.7%)
Day 14
CFU 20,000 1 19 5.3% ( 0.1%, 26.0%) 0 21 0.0% ( 0.0%, 16.1%)
CFU 80,000 8 41 19.5% ( 8.8%, 34.9%) 1 33 3.0% ( 0.1%, 15.8%)
CFU 160,000 10 46 21.7% (10.9%, 36.4%) 3 44 6.8% ( 1.4%, 18.7%)
Any day
CFU 20,000 2 19 10.5% ( 1.3%, 33.1%) 0 21 0.0% ( 0.0%, 16.1%)
CFU 80,000 12 41 29.3% (16.1%, 45.5%) 6 33 18.2% ( 7.0%, 35.5%)
CFU 160,000 16 46 34.8% (21.4%, 50.2%) 4 44 9.1% ( 2.5%, 21.7%)
write.csv(datCarrStudyArmDoseVisitKable,row.names=F,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_ManuscriptTable2_CarriageByStudyVisitInoculationDoseStudyArm.csv"))
# <!-- ##### Confirming carriage at Visits F and G by LytA PCR -->

datLytAVisitE<-read.csv(fileLytAVisitE)
datLytAVisitF<-read.csv(fileLytAVisitF)
datLytAVisitG<-read.csv(fileLytAVisitG)

datLytAVisitE<-datLytAVisitE %>%
  dplyr::mutate(pid=datFollowUp$pid[match(datLytAVisitE$SAMPLE.ID,datFollowUp$wash)])

datLytAVisitF<-datLytAVisitF %>%
  dplyr::mutate(pid=datFollowUp$pid[match(datLytAVisitF$Sample_id,datFollowUp$wash)])

datLytAVisitG<-datLytAVisitG %>%
  dplyr::mutate(pid=datFollowUp$pid[match(datLytAVisitG$SAMPLE.ID,datFollowUp$wash)])

simDat<-simDat %>%
  dplyr::mutate(
    carriage6BLytAVisitE=ifelse((datLytAVisitE$LytA.6B.PCR=="Positive" & datLytAVisitE$LytA.6B.Serotype=="6B")[match(simDat$pid,datLytAVisitE$pid)],1,0),
    carriageNot6BLytAVisitE=ifelse((datLytAVisitE$LytA.6B.PCR=="Positive" & datLytAVisitE$LytA.6B.Serotype=="Non 6B")[match(simDat$pid,datLytAVisitE$pid)],1,0),
    carriage6BLytAVisitF=ifelse((datLytAVisitF$LytA.6B.PCR=="Positive" & datLytAVisitF$LytA.6B.Serotype=="6B")[match(simDat$pid,datLytAVisitF$pid)],1,0),
    carriageNot6BLytAVisitF=ifelse((datLytAVisitF$LytA.6B.PCR=="Positive" & datLytAVisitF$LytA.6B.Serotype=="Non 6B")[match(simDat$pid,datLytAVisitF$pid)],1,0),
    carriage6BLytAVisitG=ifelse((datLytAVisitG$LytA.6B.PCR=="Positive" & datLytAVisitG$LytA.6B.Serotype=="6B")[match(simDat$pid,datLytAVisitG$pid)],1,0),
    carriageNot6BLytAVisitG=ifelse((datLytAVisitG$LytA.6B.PCR=="Positive" & datLytAVisitG$LytA.6B.Serotype=="Non 6B")[match(simDat$pid,datLytAVisitG$pid)],1,0),
  )

missPIDsVisitE<-setdiff(simDat$pid,datLytAVisitE$pid)
missPIDsVisitF<-setdiff(simDat$pid,datLytAVisitF$pid)
missPIDsVisitG<-setdiff(simDat$pid,datLytAVisitG$pid)

missWashIDsVisitE<-(datFollowUp %>% dplyr::filter(pid %in% missPIDsVisitE & visit=="E") %>% dplyr::select(wash))[,1]
missWashIDsVisitF<-(datFollowUp %>% dplyr::filter(pid %in% missPIDsVisitF & visit=="F") %>% dplyr::select(wash))[,1]
missWashIDsVisitG<-(datFollowUp %>% dplyr::filter(pid %in% missPIDsVisitG & visit=="G") %>% dplyr::select(wash))[,1]

cultPostPCRnegVisitE<-simDat$pid[simDat$carriageVisitE==1 & !is.na(simDat$carriage6BLytAVisitE) & simDat$carriage6BLytAVisitE==0]
cultPostPCRnegVisitF<-simDat$pid[simDat$carriageVisitF==1 & !is.na(simDat$carriage6BLytAVisitF) & simDat$carriage6BLytAVisitF==0]
cultPostPCRnegVisitG<-simDat$pid[simDat$carriageVisitG==1 & !is.na(simDat$carriage6BLytAVisitG) & simDat$carriage6BLytAVisitG==0]

dfE<-table(simDat$carriageVisitE,simDat$carriage6BLytAVisitE,useNA="always")[1:2,]
dfF<-table(simDat$carriageVisitF,simDat$carriage6BLytAVisitF,useNA="always")[1:2,]
dfG<-table(simDat$carriageVisitG,simDat$carriage6BLytAVisitG,useNA="always")[1:2,]

dfE<-data.frame(
  Culture6B=c("Culture negative (6B)","Culture positive (6B)"),
  LytA6BNegative=dfE[,1],
  LytA6BPositive=dfE[,2],
  LytA6BUnknown=dfE[,3]
)

dfF<-data.frame(
  Culture6B=c("Culture negative (6B)","Culture positive (6B)"),
  LytA6BNegative=dfF[,1],
  LytA6BPositive=dfF[,2],
  LytA6BUnknown=dfF[,3]
)


dfG<-data.frame(
  Culture6B=c("Culture negative (6B)","Culture positive (6B)"),
  LytA6BNegative=dfG[,1],
  LytA6BPositive=dfG[,2],
  LytA6BUnknown=dfG[,3]
)

dfE %>%
  kable(col.names=c("","Negative","Positive","Unknown"),row.names = FALSE,caption=paste(sep="","Culture vs LytA PCR results for 6B carriage at visit E.\nThese are the culture positive, LytA PCR negative samples: < ",paste(collapse=", ",cultPostPCRnegVisitE)," >.\nThese are the samples with no LytA PCR result: < ",paste(collapse=" ",missPIDsVisitE)," >.")) %>%
  kable_styling(full_width = FALSE) %>%
  add_header_above(c(" "=1,"LytA PCR"=3))

dfF %>%
  kable(col.names=c("","Negative","Positive","Unknown"),row.names = FALSE,caption=paste(sep="","Culture vs LytA PCR results for 6B carriage at visit F.\nThese are the culture positive, LytA PCR negative samples: < ",paste(collapse=", ",cultPostPCRnegVisitF)," >.\nThese are the samples with no LytA PCR result: < ",paste(collapse=" ",missPIDsVisitF)," >.")) %>%
  kable_styling(full_width = FALSE) %>%
  add_header_above(c(" "=1,"LytA PCR"=3))

dfG %>%
  kable(col.names=c("","Negative","Positive","Unknown"),row.names = FALSE,caption=paste(sep="","Culture vs LytA PCR results for 6B carriage at visit G.\nThese are the culture positive, LytA PCR negative samples: < ",paste(collapse=", ",cultPostPCRnegVisitG)," >.\nThese are the samples with no LytA PCR result: < ",paste(collapse=" ",missPIDsVisitG)," >.")) %>%
  kable_styling(full_width = FALSE) %>%
  add_header_above(c(" "=1,"LytA PCR"=3))

#<!-- These are the PIDs with missing wash data for Visit F: `r paste(collapse=" ",missPIDsVisitF)` and these are the associated wash IDs: `r paste(collapse=" ",missWashIDsVisitF)`. -->

#<!-- These are the PIDs with missing wash data for Visit G: `r paste(collapse=" ",missPIDsVisitG)` and these are the associated wash IDs: `r paste(collapse=" ",missWashIDsVisitG)`. -->
cultPosPcrNegPIDs<-c(cultPostPCRnegVisitE,cultPostPCRnegVisitF,cultPostPCRnegVisitG)

simDat %>%
dplyr::filter(pid %in% cultPosPcrNegPIDs) %>%
  dplyr::select(c(pid,carriageVisitE,carriageNot6BVisitE,carriage6BLytAVisitE,carriageNot6BLytAVisitE,carriageVisitF,carriageNot6BVisitF,carriage6BLytAVisitF,carriageNot6BLytAVisitF,carriageVisitG,carriageNot6BVisitG,carriage6BLytAVisitG,carriageNot6BLytAVisitG)) %>%
  tidyr::pivot_longer(cols=-c(pid),names_to=c(".value","Visit"),names_pattern="(.+)Visit(.+)") %>%
  dplyr::mutate(
    carriage=case_when(carriage==1~"Positive",TRUE~"Negative"),
    carriageNot6B=case_when(carriageNot6B==1~"Positive",TRUE~"Negative"),
    carriage6BLytA=case_when(carriage6BLytA==1~"Positive",TRUE~"Negative"),
    carriageNot6BLytA=case_when(carriageNot6BLytA==1~"Positive",TRUE~"Negative")
  ) %>%
  dplyr::filter(carriage=="Positive" & carriage6BLytA=="Negative") %>%
  dplyr::arrange(Visit) %>%
  kable(col.names = c("PID","Visit","6B","Not 6B","6B","Not 6B"),caption="Summary of culture postive (for 6B) and LytA PCR negative samples.") %>%
  kable_styling(full_width = FALSE) %>%
  add_header_above(header=c(" "=2,"Culture"=2,"LytA PCR"=2))

4.2.1.1 Sensitivity analysis: confounding by natural carriage

Natural carriage may act as a confounder on experimental carriage. In Figure 4.3 below we summarise the natural carriages rates at visits B, C, E, F and G.

simDat<-simDat %>%
  dplyr::mutate(
    carriageNot6BVTVisitB=NA,
    carriageNot6BVTVisitC=NA,
    carriageNot6BVTVisitE=NA,
    carriageNot6BVTVisitF=NA,
    carriageNot6BVTVisitG=NA,
    carriageNot6BNVTVisitB=NA,
    carriageNot6BNVTVisitC=NA,
    carriageNot6BNVTVisitE=NA,
    carriageNot6BNVTVisitF=NA,
    carriageNot6BNVTVisitG=NA
  )

tmp<-datCarriageSerotype %>% dplyr::filter(Visit=="B")
simDat$carriageNot6BVTVisitB<-ifelse(!is.na(tmp$carriageWithVT[match(simDat$pid,tmp$pid)]) & tmp$carriageWithVT[match(simDat$pid,tmp$pid)]=="Natural VT",1,0)
simDat$carriageNot6BNVTVisitB<-ifelse(!is.na(tmp$carriageWithVT[match(simDat$pid,tmp$pid)]) & tmp$carriageWithVT[match(simDat$pid,tmp$pid)]=="Natural NVT",1,0)

tmp<-datCarriageSerotype %>% dplyr::filter(Visit=="C")
simDat$carriageNot6BVTVisitC<-ifelse(!is.na(tmp$carriageWithVT[match(simDat$pid,tmp$pid)]) & tmp$carriageWithVT[match(simDat$pid,tmp$pid)]=="Natural VT",1,0)
simDat$carriageNot6BNVTVisitC<-ifelse(!is.na(tmp$carriageWithVT[match(simDat$pid,tmp$pid)]) & tmp$carriageWithVT[match(simDat$pid,tmp$pid)]=="Natural NVT",1,0)

tmp<-datCarriageSerotype %>% dplyr::filter(Visit=="E")
simDat$carriageNot6BVTVisitE<-ifelse(!is.na(tmp$carriageWithVT[match(simDat$pid,tmp$pid)]) & tmp$carriageWithVT[match(simDat$pid,tmp$pid)]=="Natural VT",1,0)
simDat$carriageNot6BNVTVisitE<-ifelse(!is.na(tmp$carriageWithVT[match(simDat$pid,tmp$pid)]) & tmp$carriageWithVT[match(simDat$pid,tmp$pid)]=="Natural NVT",1,0)

tmp<-datCarriageSerotype %>% dplyr::filter(Visit=="F")
simDat$carriageNot6BVTVisitF<-ifelse(!is.na(tmp$carriageWithVT[match(simDat$pid,tmp$pid)]) & tmp$carriageWithVT[match(simDat$pid,tmp$pid)]=="Natural VT",1,0)
simDat$carriageNot6BNVTVisitF<-ifelse(!is.na(tmp$carriageWithVT[match(simDat$pid,tmp$pid)]) & tmp$carriageWithVT[match(simDat$pid,tmp$pid)]=="Natural NVT",1,0)

tmp<-datCarriageSerotype %>% dplyr::filter(Visit=="G")
simDat$carriageNot6BVTVisitG<-ifelse(!is.na(tmp$carriageWithVT[match(simDat$pid,tmp$pid)]) & tmp$carriageWithVT[match(simDat$pid,tmp$pid)]=="Natural VT",1,0)
simDat$carriageNot6BNVTVisitG<-ifelse(!is.na(tmp$carriageWithVT[match(simDat$pid,tmp$pid)]) & tmp$carriageWithVT[match(simDat$pid,tmp$pid)]=="Natural NVT",1,0)

simDat<-simDat %>%
  dplyr::mutate(
    carriageNot6BVT=ifelse(carriageNot6BVTVisitB==1 | carriageNot6BVTVisitC==1 | carriageNot6BVTVisitE==1 | carriageNot6BVTVisitF==1 | carriageNot6BVTVisitG==1,1,0),
    carriageNot6BNVT=ifelse(carriageNot6BNVTVisitB==1 | carriageNot6BNVTVisitC==1 | carriageNot6BNVTVisitE==1 | carriageNot6BNVTVisitF==1 | carriageNot6BNVTVisitG==1,1,0)
  )

prepPlotDat<-function(var,visit){
  plotDatNC<-simDat %>%
    group_by(vaccInoDose,VaccName,dose) %>%
    summarise(
      n=n(),
      k=sum(get(var)),
      .groups="drop") %>%
    mutate(
      prop=k/n,
      propLow=NA,
      propUpp=NA
    )
  
  plotDatTmp<-simDat %>%
    group_by(VaccName) %>%
    summarise(
      n=n(),
      k=sum(get(var)),
      .groups="drop") %>%
    mutate(
      prop=k/n,
      propLow=NA,
      propUpp=NA
    ) %>%
    mutate(
      dose="(any dose)",
      vaccInoDose=paste(sep=" ",VaccName,"(any dose)")
    )
  
  plotDatNC<-rbind(plotDatNC,plotDatTmp) %>%
    dplyr::select(vaccInoDose,VaccName,dose,n,k,prop,propLow,propUpp) %>%
    as.data.frame()
  
  plotDatNC$Visit<-visit
  
  return(plotDatNC)
}

plotDatNC<-prepPlotDat(var="carriageNot6BVisitB",visit="VisitB")
plotDatNC<-rbind(plotDatNC,prepPlotDat(var="carriageNot6BVisitC",visit="VisitC"))
plotDatNC<-rbind(plotDatNC,prepPlotDat(var="carriageNot6BVisitE",visit="VisitE"))
plotDatNC<-rbind(plotDatNC,prepPlotDat(var="carriageNot6BVisitF",visit="VisitF"))
plotDatNC<-rbind(plotDatNC,prepPlotDat(var="carriageNot6BVisitG",visit="VisitG"))
plotDatNC<-rbind(plotDatNC,prepPlotDat(var="carriageNot6B",visit="Overall"))

for(j in 1:nrow(plotDatNC)){
  ci<-binom.test(x=plotDatNC$k[j],n=plotDatNC$n[j])$conf.int
  plotDatNC$propLow[j]<-ci[1]
  plotDatNC$propUpp[j]<-ci[2] 
}

rm(ci)

plotDatNC$dose<-fct_recode(factor(plotDatNC$dose,levels=c("20000","80000","160000","(any dose)")),"20,000 CFU"="20000","80,000 CFU"="80000","160,000 CFU"="160000","(any dose)"="(any dose)")
plotDatNC<-plotDatNC[order(plotDatNC$VaccName,plotDatNC$dose),]

#plotDatNC<-plotDatNC %>%
#  mutate(col=c(blueCols5,orangeCols5))

plotDatNC$vaccInoDose<-factor(plotDatNC$vaccInoDose,levels=c("Saline 20,000","Saline 80,000","Saline 160,000","Saline (any dose)","PCV-13 20,000","PCV-13 80,000","PCV-13 160,000","PCV-13 (any dose)"))

yMax<-max(c(plotDatNC$propUpp))

plotDatNC$Visit<-factor(plotDatNC$Visit,levels=c(paste(sep="","Visit",c("B","C","E","F","G")),"Overall"))
plotDatNC$Visit<-forcats::fct_recode(plotDatNC$Visit,"Pre-vaccination"="VisitB","Post-vaccination"="VisitC","Day 2"="VisitE","Day 7"="VisitF","Day 14"="VisitG","Overall"="Overall")

dodge<-position_dodge(width=0.9)

g1<-plotDatNC %>%
  filter(VaccName=="Saline") %>%
  ggplot(mapping=aes(x=Visit,y=prop,ymin=propLow,ymax=propUpp,fill=dose,label=paste(sep=" / ",k,n))) +
  geom_bar(stat="identity",alpha=0.9,position="dodge") +
  geom_errorbar(col="grey",position = "dodge",width=0.25) +
  #geom_text(y=-0.012,col="grey50",fontface = "bold") +
  scale_fill_manual(values=c(rgb(70+60,130+60,180+60,maxColorValue=255),rgb(70+40,130+40,180+40,maxColorValue=255),rgb(70,130,180,maxColorValue=255),rgb(70-40,130-40,180-40,maxColorValue=255),rgb(70-70,130-100,180-100,maxColorValue=255))) +
  xlab("Study arm & inoculation dose") +
  ylab("Carriage proportion") +
  #guides(fill="none") +
  theme_light() +
  labs(title="Saline") +
  ylim(c(0,yMax)) +
  geom_vline(xintercept=5.5,col="darkgrey",lwd=1.25,lty=2) +
  theme(axis.text.x = element_text(angle = 45, hjust=1),legend.position = "bottom")

g2<-plotDatNC %>%
  filter(VaccName=="PCV-13") %>%
  ggplot(mapping=aes(x=Visit,y=prop,ymin=propLow,ymax=propUpp,fill=dose,label=paste(sep=" / ",k,n))) +
  geom_bar(stat="identity",alpha=0.9,position="dodge",width=0.25) +
  geom_errorbar(col="grey",position = "dodge") +
  #geom_text(y=-0.012,col="grey50",fontface = "bold") +
  scale_fill_manual(values=c(rgb(245+10,165+80,0,maxColorValue=255),rgb(245+10,165+40,0,maxColorValue=255),rgb(245,165,0,maxColorValue=255),rgb(245-10,165-40,0,maxColorValue=255),rgb(245-30,165-120,0,maxColorValue=255))) +
  xlab("Study arm & inoculation dose") +
  ylab("Carriage proportion") +
  #guides(fill="none") +
  theme_light() +
  labs(title="PCV-13") +
  ylim(c(0,yMax)) +
  geom_vline(xintercept=5.5,col="darkgrey",lwd=1.25,lty=2) +
  theme(axis.text.x = element_text(angle = 45, hjust=1),legend.position = "bottom")

g3<-plotDatNC %>%
  filter(dose=="(any dose)") %>%
  ggplot(mapping=aes(x=Visit,y=prop,ymin=propLow,ymax=propUpp,fill=VaccName,label=paste(sep=" / ",k,n))) +
  geom_bar(stat="identity",alpha=0.9,position=dodge) +
  geom_errorbar(col="grey",position = dodge,width=0.25) +
  #geom_text(y=-0.012,col="grey50",fontface = "bold") +
  scale_fill_manual(values=c("steelblue","orange"),name="Study arm") +
  xlab("") +
  ylab("Natural carriage proportion") +
  #guides(fill="none") +
  theme_light() +
  labs(caption="Natural carriage proportions per study arm at the different study visits.\nThe right-most pair of bars show overall carriage, defined as carriage at any study visit.") +
  ylim(c(0,yMax)) +
  geom_vline(xintercept=5.5,col="darkgrey",lwd=1.25,lty=2) +
  theme(axis.text.x = element_text(angle = 45, hjust=1),legend.position = "right",text = element_text(size=18))

figCap<-paste(sep="","Natural carriage proportions by study arm and visit.")

print(g3)
Natural pneumococcal carriage by study arm and visit.

Figure 4.3: Natural pneumococcal carriage by study arm and visit.

pdf(width=12,height=8,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_carriageByStudyArmVisit.pdf"))
print(g3)
dev.off()
## quartz_off_screen 
##                 2
png(width=12,height=8,res=450,units="in",file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_carriageByStudyArmVisit.png"))
print(g3)
dev.off()
## quartz_off_screen 
##                 2
plotDatNC<-prepPlotDat(var="carriageNot6BVTVisitB",visit="VisitB")
plotDatNC<-rbind(plotDatNC,prepPlotDat(var="carriageNot6BVTVisitC",visit="VisitC"))
plotDatNC<-rbind(plotDatNC,prepPlotDat(var="carriageNot6BVTVisitE",visit="VisitE"))
plotDatNC<-rbind(plotDatNC,prepPlotDat(var="carriageNot6BVTVisitF",visit="VisitF"))
plotDatNC<-rbind(plotDatNC,prepPlotDat(var="carriageNot6BVTVisitG",visit="VisitG"))
plotDatNC<-rbind(plotDatNC,prepPlotDat(var="carriageNot6BVT",visit="Overall"))

plotDatNC<-plotDatNC %>%
  mutate(serotype="Vaccine-type")

plotDatNC2<-prepPlotDat(var="carriageNot6BNVTVisitB",visit="VisitB")
plotDatNC2<-rbind(plotDatNC2,prepPlotDat(var="carriageNot6BNVTVisitC",visit="VisitC"))
plotDatNC2<-rbind(plotDatNC2,prepPlotDat(var="carriageNot6BNVTVisitE",visit="VisitE"))
plotDatNC2<-rbind(plotDatNC2,prepPlotDat(var="carriageNot6BNVTVisitF",visit="VisitF"))
plotDatNC2<-rbind(plotDatNC2,prepPlotDat(var="carriageNot6BNVTVisitG",visit="VisitG"))
plotDatNC2<-rbind(plotDatNC2,prepPlotDat(var="carriageNot6BNVT",visit="Overall"))

plotDatNC2<-plotDatNC2 %>%
  mutate(serotype="Not vaccine-type")

plotDatNC<-rbind(plotDatNC,plotDatNC2)
rm(plotDatNC2)

for(j in 1:nrow(plotDatNC)){
  ci<-binom.test(x=plotDatNC$k[j],n=plotDatNC$n[j])$conf.int
  plotDatNC$propLow[j]<-ci[1]
  plotDatNC$propUpp[j]<-ci[2] 
}

rm(ci)

plotDatNC$dose<-fct_recode(factor(plotDatNC$dose,levels=c("20000","80000","160000","(any dose)")),"20,000 CFU"="20000","80,000 CFU"="80000","160,000 CFU"="160000","(any dose)"="(any dose)")
plotDatNC<-plotDatNC[order(plotDatNC$VaccName,plotDatNC$dose),]

#plotDatNC<-plotDatNC %>%
#  mutate(col=c(blueCols5,orangeCols5))

plotDatNC$vaccInoDose<-factor(plotDatNC$vaccInoDose,levels=c("Saline 20,000","Saline 80,000","Saline 160,000","Saline (any dose)","PCV-13 20,000","PCV-13 80,000","PCV-13 160,000","PCV-13 (any dose)"))

yMax<-max(c(plotDatNC$propUpp))

plotDatNC$Visit<-factor(plotDatNC$Visit,levels=c(paste(sep="","Visit",c("B","C","E","F","G")),"Overall"))
plotDatNC$Visit<-forcats::fct_recode(plotDatNC$Visit,"Pre-vaccination"="VisitB","Post-vaccination"="VisitC","Day 2"="VisitE","Day 7"="VisitF","Day 14"="VisitG","Overall"="Overall")

dodge<-position_dodge(width=0.9)

g4<-plotDatNC %>%
  filter(dose=="(any dose)") %>%
  ggplot(mapping=aes(x=Visit,y=prop,ymin=propLow,ymax=propUpp,fill=VaccName,label=paste(sep=" / ",k,n))) +
  geom_bar(stat="identity",alpha=0.9,position=dodge) +
  geom_errorbar(col="grey",position = dodge,width=0.25) +
  #geom_text(y=-0.012,col="grey50",fontface = "bold") +
  scale_fill_manual(values=c("steelblue","orange"),name="Study arm") +
  xlab("") +
  ylab("Natural carriage proportion") +
  #guides(fill="none") +
  theme_light() +
  labs(caption="Natural carriage proportions per study arm at the different study visits.\nThe right-most pair of bars show overall carriage, defined as carriage at any study visit.\nThe graph is stratified by whether carriage was vaccine-type or not.") +
  ylim(c(0,yMax)) +
  geom_vline(xintercept=5.5,col="darkgrey",lwd=1.25,lty=2) +
  theme(axis.text.x = element_text(angle = 45, hjust=1),legend.position = "right",text = element_text(size=18)) +
  facet_wrap(~serotype,nrow=2)

png(width=12,height=12,res=450,units="in",file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_carriageByStudyArmVisit_ByVaccineTypeStatus.png"))
print(g4)
dev.off()
## quartz_off_screen 
##                 2
print(g4)
Natural pneumococcal carriage by study arm and visit, stratified by whether carriage was vaccine-type or not.

Figure 4.4: Natural pneumococcal carriage by study arm and visit, stratified by whether carriage was vaccine-type or not.

To investigate confounding, we will first tabulate the proportions of experimental carriage among i) individuals who were natural carriers at visit C and those who were not carriers at that visit, ii) individuals who were natural carriers at any of visits C, E, F, G and those who did not become natural carriers at all. We will stratify these tabulations by randomisation arm.

datTabNatCarr<-data.frame(
  vaccName=c(rep(levels(simDat$VaccName)[1],2),rep(levels(simDat$VaccName)[2],2)),
  natCarr=rep(c("No natural carriage","Natural carriage"),2),
  k=NA,
  n=NA,
  perc=NA
)

datTmp<-table(simDat$carriageNot6B,simDat$carriage,simDat$VaccName)

for(j in 1:nrow(datTabNatCarr)){
  datTabNatCarr$k[j]<-datTmp[ifelse(datTabNatCarr$natCarr[j]=="No natural carriage","0","1"),"1",datTabNatCarr$vaccName[j]]
  datTabNatCarr$n[j]<-datTmp[ifelse(datTabNatCarr$natCarr[j]=="No natural carriage","0","1"),"0",datTabNatCarr$vaccName[j]]+datTmp[ifelse(datTabNatCarr$natCarr[j]=="No natural carriage","0","1"),"1",datTabNatCarr$vaccName[j]]
}

datTabNatCarr$perc<-paste(sep="",format(nsmall=2,round(digits=2,100*datTabNatCarr$k/datTabNatCarr$n)),"%")

datTabNatCarr %>%
  dplyr::select(!vaccName) %>%
  knitr::kable(col.names=c("","6B carriers","Total number of participants","Percentage of 6B carriers"),caption="Tabulated counts and percentages of experimental carriers stratified by natural carriage and randomisation arm. Natural carriage is defined here as being a carrier of a non-6B strain at any of visits C, E, F, G.") %>%
  kableExtra::kable_styling(full_width = FALSE) %>%
  kableExtra::pack_rows(group_label="Saline",1,2) %>%
  kableExtra::pack_rows(group_label="PCV-13",3,4)
Table 4.4: Tabulated counts and percentages of experimental carriers stratified by natural carriage and randomisation arm. Natural carriage is defined here as being a carrier of a non-6B strain at any of visits C, E, F, G.
6B carriers Total number of participants Percentage of 6B carriers
Saline
No natural carriage 20 64 31.25%
Natural carriage 10 42 23.81%
PCV-13
No natural carriage 5 55 9.09%
Natural carriage 5 43 11.63%
datTmp<-table(simDat$carriageNot6BVisitC,simDat$carriage,simDat$VaccName)

for(j in 1:nrow(datTabNatCarr)){
  datTabNatCarr$k[j]<-datTmp[ifelse(datTabNatCarr$natCarr[j]=="No natural carriage","0","1"),"1",datTabNatCarr$vaccName[j]]
  datTabNatCarr$n[j]<-datTmp[ifelse(datTabNatCarr$natCarr[j]=="No natural carriage","0","1"),"0",datTabNatCarr$vaccName[j]]+datTmp[ifelse(datTabNatCarr$natCarr[j]=="No natural carriage","0","1"),"1",datTabNatCarr$vaccName[j]]
}

datTabNatCarr$perc<-paste(sep="",format(nsmall=2,round(digits=2,100*datTabNatCarr$k/datTabNatCarr$n)),"%")

datTabNatCarr<-datTabNatCarr %>%
  dplyr::mutate(RR=NA)

datTabNatCarr$RR[datTabNatCarr$natCarr=="Natural carriage" & datTabNatCarr$vaccName=="Saline"]<-format(nsmall=1,round(digits=2,(datTabNatCarr$k/datTabNatCarr$n)[datTabNatCarr$vaccName=="Saline" & datTabNatCarr$natCarr=="Natural carriage"]/(datTabNatCarr$k/datTabNatCarr$n)[datTabNatCarr$vaccName=="Saline" & datTabNatCarr$natCarr=="No natural carriage"]))

datTabNatCarr$RR[datTabNatCarr$natCarr=="Natural carriage" & datTabNatCarr$vaccName=="PCV-13"]<-format(nsmall=1,round(digits=2,(datTabNatCarr$k/datTabNatCarr$n)[datTabNatCarr$vaccName=="PCV-13" & datTabNatCarr$natCarr=="Natural carriage"]/(datTabNatCarr$k/datTabNatCarr$n)[datTabNatCarr$vaccName=="PCV-13" & datTabNatCarr$natCarr=="No natural carriage"]))

options(knitr.kable.NA = '')

datTabNatCarr %>%
  dplyr::select(!vaccName) %>%
  knitr::kable(col.names=c("","SPN6B carriers","Total number of participants","Percentage of carriers (95% CI)","RR"),caption="Tabulated counts and percentages of experimental carriers stratified by natural carriage and randomisation arm. Natural carriage is defined here as being a carrier of a non-6B strain at visit C. Risk ratios are for experimental carriage, comparing natural carriers to individuals without natural carriage.") %>%
  kableExtra::kable_styling(full_width = FALSE) %>%
  kableExtra::pack_rows(group_label="Saline",1,2) %>%
  kableExtra::pack_rows(group_label="PCV-13",3,4)
Table 4.5: Tabulated counts and percentages of experimental carriers stratified by natural carriage and randomisation arm. Natural carriage is defined here as being a carrier of a non-6B strain at visit C. Risk ratios are for experimental carriage, comparing natural carriers to individuals without natural carriage.
SPN6B carriers Total number of participants Percentage of carriers (95% CI) RR
Saline
No natural carriage 21 81 25.93%
Natural carriage 9 25 36.00% 1.39
PCV-13
No natural carriage 6 75 8.00%
Natural carriage 4 23 17.39% 2.17

We can present the same results slightly differently and with risk ratios (comparing PCV13 against saline) for experimental carriage (Table 4.6).

datTabNatCarr<-datTabNatCarr %>%
  dplyr::arrange(desc(natCarr),desc(vaccName)) %>%
  dplyr::mutate(
    prop=k/n,
    RR=NA
  )

for(i in 1:nrow(datTabNatCarr)){
  tmp<-binom.test(x=datTabNatCarr$k[i],n=datTabNatCarr$n[i])
  datTabNatCarr$perc[i]<-paste(sep="",format(nsmall=1,round(digits=1,100*datTabNatCarr$k[i]/datTabNatCarr$n[i])),"% (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*tmp$conf.int))),"%)")
}  

datTabNatCarr$RR[datTabNatCarr$vaccName=="PCV-13" & datTabNatCarr$natCarr=="No natural carriage"]<-format(nsmall=1,round(digits=2,datTabNatCarr$prop[datTabNatCarr$vaccName=="PCV-13" & datTabNatCarr$natCarr=="No natural carriage"]/datTabNatCarr$prop[datTabNatCarr$vaccName=="Saline" & datTabNatCarr$natCarr=="No natural carriage"]))

datTabNatCarr$RR[datTabNatCarr$vaccName=="PCV-13" & datTabNatCarr$natCarr=="Natural carriage"]<-format(nsmall=1,round(digits=2,datTabNatCarr$prop[datTabNatCarr$vaccName=="PCV-13" & datTabNatCarr$natCarr=="Natural carriage"]/datTabNatCarr$prop[datTabNatCarr$vaccName=="Saline" & datTabNatCarr$natCarr=="Natural carriage"]))

datTabNatCarr<-datTabNatCarr %>%
  dplyr::select(-prop)

write.csv(datTabNatCarr,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_ManuscriptTable4_EffectOfNaturalCarriage.csv"))

options(knitr.kable.NA = '')

datTabNatCarr %>%
  dplyr::select(-natCarr) %>%
  kable(row.names=F,col.names=c("Study arm","SPN6B carriers","Total participants","Percentage of carriers\n(95% CI)","RR"),caption="Tabulated counts and percentages of experimental carriers stratified by randomisation arm and natural carriage (with risk ratios for experimental carriage). Natural carriage is defined here as being a carrier of a non-6B strain at visit C. Risk ratios are for experimental carriage, comparing the PCV-13 study arm to the saline arm.") %>%
  kable_styling(full_width = FALSE) %>%
  pack_rows(start_row = 1,end_row = 2,group_label = "No natural carriage") %>%
  pack_rows(start_row = 3,end_row = 4,group_label = "Natural carriage")
Table 4.6: Tabulated counts and percentages of experimental carriers stratified by randomisation arm and natural carriage (with risk ratios for experimental carriage). Natural carriage is defined here as being a carrier of a non-6B strain at visit C. Risk ratios are for experimental carriage, comparing the PCV-13 study arm to the saline arm.
Study arm SPN6B carriers Total participants Percentage of carriers (95% CI) RR
No natural carriage
Saline 21 81 25.9% (16.8%, 36.9%)
PCV-13 6 75 8.0% ( 3.0%, 16.6%) 0.31
Natural carriage
Saline 9 25 36.0% (18.0%, 57.5%)
PCV-13 4 23 17.4% ( 5.0%, 38.8%) 0.48

While it does seem like there are differential rates of experimental carriage according to natural carriage within the PCV-13 vaccinated group, overall the vaccine still leads to lower carriage in both natural carriers and those who do not have natural carriage.

We can in fact, stratify the main trial analysis by natural carriage status (defined as either natural carriage at Visit C, E, f or G or natural carriage at Visit C only) - and get the same conclusion as the main analysis: the PCV-13 vaccine protects against experimental carriage. We note however that this result is weakened, and in fact not statistically significant, in the natural carrier groups (whether this is defined as natural carrier at any visit or just at Visit C). This is, however, most probably not just a result of a weaker effect but is likely to be primarily driven by a lower sample size as most participants did not develop natural carriage.

simDat_nc1a<-simDat %>%
  dplyr::filter(carriageNot6B==0)
mod_nc1a<-logbin::logbin(carriage~VaccName+doseGroup,data=simDat_nc1a)

simDat_nc1b<-simDat %>%
  dplyr::filter(carriageNot6B==1)
mod_nc1b<-logbin::logbin(carriage~VaccName+doseGroup,data=simDat_nc1b)

modTmp<-rbind(summary(mod_nc1a)$coefficients,summary(mod_nc1b)$coefficients)
modRes<-modTmp %>%
  as.data.frame() %>%
  dplyr::mutate(
    parameter=case_when(
      rownames(modTmp)=="(Intercept)"~"(Intercept)",
      rownames(modTmp)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(modTmp)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
      rownames(modTmp)=="doseGroupCFU 80000"~"Dose: CFU 80,000"
    )
  )

colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")

modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  dplyr::mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
  dplyr::mutate(Estimate=exp(Estimate)) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption="Summary of the log-binomial regression model fits, stratified by natural carriage status (at any of Visits C, E, F or G). The 'Estimate' column in the table below shows the baseline risk on the '(Intercept)' row and the relative risk associated with each of the exposures on the subsequent rows.",digits=3) %>%
  kable_styling(full_width = FALSE) %>%
  pack_rows(start_row=1,end_row=4,group_label="No natural carriage at any of Visits C, E, F, or G.") %>%
  pack_rows(start_row=5,end_row=8,group_label="Natural carriage for at least one of Visits C, E, F, or G.") 
Table 4.7: Summary of the log-binomial regression model fits, stratified by natural carriage status (at any of Visits C, E, F or G). The ‘Estimate’ column in the table below shows the baseline risk on the ‘(Intercept)’ row and the relative risk associated with each of the exposures on the subsequent rows.
Estimate Std. error Z statistic p-value
No natural carriage at any of Visits C, E, F, or G.
(Intercept) 0.386 0.234 -4.064 0.000
PCV-13 vaccine 0.294 0.459 -2.666 0.008
Dose: CFU 80,000 0.907 0.337 -0.288 0.773
Dose: CFU 20,000 0.172 0.992 -1.776 0.076
Natural carriage for at least one of Visits C, E, F, or G.
(Intercept) 0.243 0.370 -3.825 0.000
PCV-13 vaccine 0.531 0.497 -1.273 0.203
Dose: CFU 80,000 1.207 0.470 0.401 0.689
Dose: CFU 20,000 0.341 1.016 -1.060 0.289
simDat_nc2a<-simDat %>%
  dplyr::filter(carriageNot6BVisitC==0)
mod_nc2a<-logbin::logbin(carriage~VaccName+doseGroup,data=simDat_nc2a)

simDat_nc2b<-simDat %>%
  dplyr::filter(carriageNot6BVisitC==1)
mod_nc2b<-logbin::logbin(carriage~VaccName+doseGroup,data=simDat_nc2b)

modTmp<-rbind(summary(mod_nc2a)$coefficients,summary(mod_nc2b)$coefficients)
modRes<-modTmp %>%
  as.data.frame() %>%
  dplyr::mutate(
    parameter=case_when(
      rownames(modTmp)=="(Intercept)"~"(Intercept)",
      rownames(modTmp)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(modTmp)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
      rownames(modTmp)=="doseGroupCFU 80000"~"Dose: CFU 80,000"
    )
  )

colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")

modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  dplyr::mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
  dplyr::mutate(Estimate=exp(Estimate)) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption="Summary of the log-binomial regression model fits, stratified by natural carriage status (at Visit C). The 'Estimate' column in the table below shows the baseline risk on the '(Intercept)' row and the relative risk associated with each of the exposures on the subsequent rows.",digits=3) %>%
  kable_styling(full_width = FALSE) %>%
  pack_rows(start_row=1,end_row=4,group_label="No natural carriage at Visit C.") %>%
  pack_rows(start_row=5,end_row=8,group_label="Natural carriage at Visit C.") 
Table 4.8: Summary of the log-binomial regression model fits, stratified by natural carriage status (at Visit C). The ‘Estimate’ column in the table below shows the baseline risk on the ‘(Intercept)’ row and the relative risk associated with each of the exposures on the subsequent rows.
Estimate Std. error Z statistic p-value
No natural carriage at Visit C.
(Intercept) 0.331 0.237 -4.661 0.000
PCV-13 vaccine 0.307 0.429 -2.751 0.006
Dose: CFU 80,000 0.879 0.337 -0.382 0.702
Dose: CFU 20,000 0.148 1.000 -1.911 0.056
Natural carriage at Visit C.
(Intercept) 0.323 0.371 -3.047 0.002
PCV-13 vaccine 0.546 0.520 -1.166 0.244
Dose: CFU 80,000 1.434 0.468 0.771 0.441
Dose: CFU 20,000 0.516 0.985 -0.672 0.502

It is important to note that we are not powered to do a formal statistical comparison of experimental carriage proportion between these groups. We will still do this, but results are to be properly caveated given that this was not powered for.

mod_NC1<-glm(carriage~carriageNot6B*VaccName+doseGroup,data=simDat,family=binomial("log"))

modRes<-summary(mod_NC1)$coefficients %>%
  as.data.frame() %>%
  mutate(
    parameter=case_when(
      rownames(summary(mod_NC1)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(mod_NC1)$coefficients)=="carriageNot6B"~"Natural carriage (any visit)",
      rownames(summary(mod_NC1)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(mod_NC1)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
      rownames(summary(mod_NC1)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000",
      rownames(summary(mod_NC1)$coefficients)=="carriageNot6B:VaccNamePCV-13"~"Change in effect of PCV-13 vaccine due to natural carriage (any visit)",
    )
  )

colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")

pGlm<-summary(mod_NC1)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
rrGlmVacc<-exp(c(summary(mod_NC1)$coefficients["VaccNamePCV-13","Estimate"],confint(mod)["VaccNamePCV-13",]))

modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  dplyr::mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
  dplyr::mutate(Estimate=exp(Estimate)) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","Summary of the log-binomial regression model fit, taking natural carriage (any visit) into account. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),"). While natural carriage increases the risk of experimental carriage and the effect of the PCV-13 vaccine is lessened in natural carriers, neither the direct effect of natural carriage or the interaction effect with the vaccine intervention are statistically significant. The 'Estimate' column in the table below shows the baseline risk on the '(Intercept)' row and the relative risk associated with each of the exposures on the subsequent rows."),digits=3) %>%
  kable_styling(full_width = FALSE)
Table 4.9: Summary of the log-binomial regression model fit, taking natural carriage (any visit) into account. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 36%, 95% CI (23.6%, 56.7%). There is a statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR 0.30, 95% CI (0.20, 0.72), p = 0.008). While natural carriage increases the risk of experimental carriage and the effect of the PCV-13 vaccine is lessened in natural carriers, neither the direct effect of natural carriage or the interaction effect with the vaccine intervention are statistically significant. The ‘Estimate’ column in the table below shows the baseline risk on the ‘(Intercept)’ row and the relative risk associated with each of the exposures on the subsequent rows.
Estimate Std. error Z statistic p-value
(Intercept) 0.365 0.224 -4.498 0.000
Natural carriage (any visit) 0.738 0.327 -0.927 0.354
PCV-13 vaccine 0.296 0.459 -2.654 0.008
Dose: CFU 80,000 1.000 0.274 0.001 0.999
Dose: CFU 20,000 0.227 0.706 -2.097 0.036
Change in effect of PCV-13 vaccine due to natural carriage (any visit) 1.790 0.676 0.862 0.389
mod_NC2<-glm(carriage~carriageNot6BVisitC*VaccName+doseGroup,data=simDat,family=binomial("log"))

modRes<-summary(mod_NC2)$coefficients %>%
  as.data.frame() %>%
  mutate(
    parameter=case_when(
      rownames(summary(mod_NC2)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(mod_NC2)$coefficients)=="carriageNot6BVisitC"~"Natural carriage (Visit C)",
      rownames(summary(mod_NC2)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(mod_NC2)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
      rownames(summary(mod_NC2)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000",
      rownames(summary(mod_NC2)$coefficients)=="carriageNot6BVisitC:VaccNamePCV-13"~"Change in effect of PCV-13 vaccine due to natural carriage (Visit C)",
    )
  )

colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")

pGlm<-summary(mod_NC2)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
rrGlmVacc<-exp(c(summary(mod_NC2)$coefficients["VaccNamePCV-13","Estimate"],confint(mod)["VaccNamePCV-13",]))

modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  dplyr::mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
  dplyr::mutate(Estimate=exp(Estimate)) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","Summary of the log-binomial regression model fit, taking natural carriage (at Visit C) into account. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),"). While natural carriage increases the risk of experimental carriage and the effect of the PCV-13 vaccine is lessened in natural carriers, neither the direct effect of natural carriage or the interaction effect with the vaccine intervention are statistically significant. The 'Estimate' column in the table below shows the baseline risk on the '(Intercept)' row and the relative risk associated with each of the exposures on the subsequent rows."),digits=3) %>%
  kable_styling(full_width = FALSE)
Table 4.10: Summary of the log-binomial regression model fit, taking natural carriage (at Visit C) into account. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 30%, 95% CI (19.0%, 47.1%). There is a statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR 0.31, 95% CI (0.20, 0.72), p = 0.007). While natural carriage increases the risk of experimental carriage and the effect of the PCV-13 vaccine is lessened in natural carriers, neither the direct effect of natural carriage or the interaction effect with the vaccine intervention are statistically significant. The ‘Estimate’ column in the table below shows the baseline risk on the ‘(Intercept)’ row and the relative risk associated with each of the exposures on the subsequent rows.
Estimate Std. error Z statistic p-value
(Intercept) 0.299 0.232 -5.210 0.000
Natural carriage (Visit C) 1.270 0.324 0.739 0.460
PCV-13 vaccine 0.312 0.430 -2.714 0.007
Dose: CFU 80,000 1.042 0.274 0.152 0.879
Dose: CFU 20,000 0.236 0.705 -2.047 0.041
Change in effect of PCV-13 vaccine due to natural carriage (Visit C) 1.831 0.671 0.901 0.367

4.2.2 LytA PCR densities (6B)

4.2.2.1 LytA PCR derived 6B carriage

The LytA density data is more up to date than the LytA carriage data. But since densities have been computed for all LytA positive samples, we can also compare culture derived and LytA PCR derived carriage results.

datLytADensity6B<-read.csv(fileLytADensity6B) %>%
  dplyr::mutate(
    pid=datFollowUp$pid[match(SampleID,datFollowUp$wash)],
    visit=datFollowUp$visit[match(SampleID,datFollowUp$wash)]
  )

datLytADensity6BVisitE<-datLytADensity6B %>% dplyr::filter(visit=="E")
datLytADensity6BVisitF<-datLytADensity6B %>% dplyr::filter(visit=="F")
datLytADensity6BVisitG<-datLytADensity6B %>% dplyr::filter(visit=="G")

simDat<-simDat %>%
  dplyr::mutate(
    carriage6BLytAVisitE=ifelse(pid %in% datLytADensity6B$pid[datLytADensity6B$visit=="E"],1,0),
    carriage6BLytAVisitF=ifelse(pid %in% datLytADensity6B$pid[datLytADensity6B$visit=="F"],1,0),
    carriage6BLytAVisitG=ifelse(pid %in% datLytADensity6B$pid[datLytADensity6B$visit=="G"],1,0),
    density6BLytAVisitE=datLytADensity6BVisitE$lytA_FAM_density_CopiesPerMlNw[match(pid,datLytADensity6BVisitE$pid)],
    density6BLytAVisitF=datLytADensity6BVisitF$lytA_FAM_density_CopiesPerMlNw[match(pid,datLytADensity6BVisitF$pid)],
    density6BLytAVisitG=datLytADensity6BVisitG$lytA_FAM_density_CopiesPerMlNw[match(pid,datLytADensity6BVisitG$pid)]
  ) %>%
  dplyr::mutate(
    carriage6BLytA=ifelse(carriage6BLytAVisitE==1 | carriage6BLytAVisitF==1 | carriage6BLytAVisitG==1,1,0)
  )

cultPostPCRnegVisitE<-simDat$pid[simDat$carriageVisitE==1 & !is.na(simDat$carriage6BLytAVisitE) & simDat$carriage6BLytAVisitE==0]
cultPostPCRnegVisitF<-simDat$pid[simDat$carriageVisitF==1 & !is.na(simDat$carriage6BLytAVisitF) & simDat$carriage6BLytAVisitF==0]
cultPostPCRnegVisitG<-simDat$pid[simDat$carriageVisitG==1 & !is.na(simDat$carriage6BLytAVisitG) & simDat$carriage6BLytAVisitG==0]

cultPosPcrNeg<-data.frame(
  pid=c(
    simDat$pid[simDat$carriageVisitE==1 & !is.na(simDat$carriage6BLytAVisitE) & simDat$carriage6BLytAVisitE==0],
    simDat$pid[simDat$carriageVisitF==1 & !is.na(simDat$carriage6BLytAVisitF) & simDat$carriage6BLytAVisitF==0],
    simDat$pid[simDat$carriageVisitG==1 & !is.na(simDat$carriage6BLytAVisitG) & simDat$carriage6BLytAVisitG==0]),
  visit=c(
    rep("E",sum(simDat$carriageVisitE==1 & !is.na(simDat$carriage6BLytAVisitE) & simDat$carriage6BLytAVisitE==0)),
    rep("F",sum(simDat$carriageVisitF==1 & !is.na(simDat$carriage6BLytAVisitF) & simDat$carriage6BLytAVisitF==0)),
    rep("G",sum(simDat$carriageVisitG==1 & !is.na(simDat$carriage6BLytAVisitG) & simDat$carriage6BLytAVisitG==0))
  )  
)

dfE<-table(simDat$carriageVisitE,simDat$carriage6BLytAVisitE,useNA="always")[1:2,]
dfF<-table(simDat$carriageVisitF,simDat$carriage6BLytAVisitF,useNA="always")[1:2,]
dfG<-table(simDat$carriageVisitG,simDat$carriage6BLytAVisitG,useNA="always")[1:2,]

dfE<-data.frame(
  Culture6B=c("Culture negative (6B)","Culture positive (6B)"),
  LytA6BNegative=dfE[,1],
  LytA6BPositive=dfE[,2],
  LytA6BUnknown=dfE[,3]
)

dfF<-data.frame(
  Culture6B=c("Culture negative (6B)","Culture positive (6B)"),
  LytA6BNegative=dfF[,1],
  LytA6BPositive=dfF[,2],
  LytA6BUnknown=dfF[,3]
)

dfG<-data.frame(
  Culture6B=c("Culture negative (6B)","Culture positive (6B)"),
  LytA6BNegative=dfG[,1],
  LytA6BPositive=dfG[,2],
  LytA6BUnknown=dfG[,3]
)
dfE %>%
  kable(col.names=c("","Negative","Positive","Unknown"),row.names = FALSE,caption=paste(sep="","Culture vs LytA PCR results for 6B carriage at visit E.\nThese are the culture positive, LytA PCR negative samples: < ",paste(collapse=", ",cultPostPCRnegVisitE)," >.\nThere are no samples with no LytA PCR result.")) %>%
  kable_styling(full_width = FALSE) %>%
  add_header_above(c(" "=1,"LytA PCR"=3))
Table 4.11: Culture vs LytA PCR results for 6B carriage at visit E. These are the culture positive, LytA PCR negative samples: . There are no samples with no LytA PCR result.
LytA PCR
Negative Positive Unknown
Culture negative (6B) 164 10 0
Culture positive (6B) 1 29 0
dfF %>%
  kable(col.names=c("","Negative","Positive","Unknown"),row.names = FALSE,caption=paste(sep="","Culture vs LytA PCR results for 6B carriage at visit F.\nThese are the culture positive, LytA PCR negative samples: < ",paste(collapse=", ",cultPostPCRnegVisitF)," >.\nThere are no samples with no LytA PCR result.")) %>%
  kable_styling(full_width = FALSE) %>%
  add_header_above(c(" "=1,"LytA PCR"=3))
Table 4.12: Culture vs LytA PCR results for 6B carriage at visit F. These are the culture positive, LytA PCR negative samples: . There are no samples with no LytA PCR result.
LytA PCR
Negative Positive Unknown
Culture negative (6B) 172 8 0
Culture positive (6B) 3 21 0
dfG %>%
  kable(col.names=c("","Negative","Positive","Unknown"),row.names = FALSE,caption=paste(sep="","Culture vs LytA PCR results for 6B carriage at visit G.\nThese are the culture positive, LytA PCR negative samples: < ",paste(collapse=", ",cultPostPCRnegVisitG)," >.\nThere are no samples with no LytA PCR result.")) %>%
  kable_styling(full_width = FALSE) %>%
  add_header_above(c(" "=1,"LytA PCR"=3))
Table 4.13: Culture vs LytA PCR results for 6B carriage at visit G. These are the culture positive, LytA PCR negative samples: . There are no samples with no LytA PCR result.
LytA PCR
Negative Positive Unknown
Culture negative (6B) 175 6 0
Culture positive (6B) 4 19 0

Overall, using the primary outcome definition of carriage (carriage at any the 3 post-inoculation study visits), across all inoculation doses and both study arms, there are 204 participants, of which 40 are culture positive for carriage (19.6%, 95% CI (14.4%, 25.7%)), while there are 52 participants who are LytA PCR positive for carriage (25.5%, 95% CI (19.7%, 32.0%)).

We can break this down by study visit, inoculation dose and study arm:

datCarrStudyArmDoseVisitCultVsLytA<-simDat %>%
  dplyr::select(c(pid,VaccName,doseGroup,carriageVisitE,carriageVisitF,carriageVisitG,carriage,carriage6BLytAVisitE,carriage6BLytAVisitF,carriage6BLytAVisitG,carriage6BLytA)) %>%
  dplyr::rename(carriageVisitAny=carriage,carriage6BLytAVisitAny=carriage6BLytA) %>%
  dplyr::mutate(doseGroup=factor(doseGroup,levels=c("CFU 20000","CFU 80000","CFU 160000"))) %>%
  tidyr::pivot_longer(cols=contains("carriage"),names_pattern="(.*)Visit(.*)",names_to=c("Method","Visit")) %>%
  tidyr::pivot_wider(id_cols = c(pid,VaccName,doseGroup,Visit),names_from="Method",values_from="value") %>%
  dplyr::rename(carriage6B=carriage) %>%
  dplyr::group_by(Visit,VaccName,doseGroup) %>%
  dplyr::summarise(
    n=n(),
    k_cult=sum(carriage6B),
    prop_cult=mean(carriage6B),
    k_lyta=sum(carriage6BLytA),
    prop_lyta=mean(carriage6BLytA),
    .groups="drop"
  ) %>%
  dplyr::mutate(
    VaccName=forcats::fct_recode(VaccName,"Saline"="Saline","PCV13"="PCV-13"),
    doseGroup=forcats::fct_recode(doseGroup,"CFU 20,000"="CFU 20000","CFU 80,000"="CFU 80000","CFU 160,000"="CFU 160000"),
    prop_cult_CILow=NA,
    prop_cult_CIUpp=NA,
    prop_cult_Formatted=NA,
    prop_lyta_CILow=NA,
    prop_lyta_CIUpp=NA,
    prop_lyta_Formatted=NA
  )

datCarrStudyArmDoseVisitCultVsLytA$Visit[datCarrStudyArmDoseVisitCultVsLytA$Visit=="Any"]<-"Any day"
datCarrStudyArmDoseVisitCultVsLytA$Visit[datCarrStudyArmDoseVisitCultVsLytA$Visit=="E"]<-"Visit E"
datCarrStudyArmDoseVisitCultVsLytA$Visit[datCarrStudyArmDoseVisitCultVsLytA$Visit=="F"]<-"Visit F"
datCarrStudyArmDoseVisitCultVsLytA$Visit[datCarrStudyArmDoseVisitCultVsLytA$Visit=="G"]<-"Visit G"
datCarrStudyArmDoseVisitCultVsLytA$Visit<-factor(datCarrStudyArmDoseVisitCultVsLytA$Visit,levels=c("Visit E","Visit F","Visit G","Any day"))
datCarrStudyArmDoseVisitCultVsLytA<-datCarrStudyArmDoseVisitCultVsLytA %>%
  dplyr::arrange(Visit)

for(i in 1:nrow(datCarrStudyArmDoseVisitCultVsLytA)){
  tmp<-binom.test(x=datCarrStudyArmDoseVisitCultVsLytA$k_cult[i],n=datCarrStudyArmDoseVisitCultVsLytA$n[i])
  datCarrStudyArmDoseVisitCultVsLytA$prop_cult_CILow[i]<-tmp$conf.int[1]
  datCarrStudyArmDoseVisitCultVsLytA$prop_cult_CIUpp[i]<-tmp$conf.int[2]
  datCarrStudyArmDoseVisitCultVsLytA$prop_cult_Formatted[i]<-paste(sep="",format(nsmall=1,round(digits=1,100*datCarrStudyArmDoseVisitCultVsLytA$prop_cult[i])),"% (",paste(collapse=",",format(nsmall=1,round(digits=1,100*c(datCarrStudyArmDoseVisitCultVsLytA$prop_cult_CILow[i],datCarrStudyArmDoseVisitCultVsLytA$prop_cult_CIUpp[i])))),")")
  tmp<-binom.test(x=datCarrStudyArmDoseVisitCultVsLytA$k_lyta[i],n=datCarrStudyArmDoseVisitCultVsLytA$n[i])
  datCarrStudyArmDoseVisitCultVsLytA$prop_lyta_CILow[i]<-tmp$conf.int[1]
  datCarrStudyArmDoseVisitCultVsLytA$prop_lyta_CIUpp[i]<-tmp$conf.int[2]
  datCarrStudyArmDoseVisitCultVsLytA$prop_lyta_Formatted[i]<-paste(sep="",format(nsmall=1,round(digits=1,100*datCarrStudyArmDoseVisitCultVsLytA$prop_lyta[i])),"% (",paste(collapse=",",format(nsmall=1,round(digits=1,100*c(datCarrStudyArmDoseVisitCultVsLytA$prop_lyta_CILow[i],datCarrStudyArmDoseVisitCultVsLytA$prop_lyta_CIUpp[i])))),")")
}

datCarrStudyArmDoseVisitCultVsLytAKable<-datCarrStudyArmDoseVisitCultVsLytA %>%
  dplyr::select(Visit,VaccName,doseGroup,n,k_cult,prop_cult_Formatted,k_lyta,prop_lyta_Formatted) %>%
  tidyr::pivot_wider(id_cols = c(Visit,doseGroup),names_from=VaccName,values_from = c(n,k_cult,prop_cult_Formatted,k_lyta,prop_lyta_Formatted)) %>%
  dplyr::select(c(Visit,doseGroup,contains("Saline"),contains("PCV13")))

datCarrStudyArmDoseVisitCultVsLytAKable %>%
  dplyr::select(-Visit) %>%
  knitr::kable(row.names=F,col.names=c("","n","k","proportion (95% CI)","k","proportion (95% CI)","n","k","proportion (95% CI)","k","proportion (95% CI)"),caption="Comparison of culture and LytA PCR derived carriage status.") %>%
  kableExtra::kable_styling(full_width = FALSE) %>%
  kableExtra::pack_rows(start_row=1,end_row=3,group_label="Day 2") %>%
  kableExtra::pack_rows(start_row=4,end_row=6,group_label="Day 7") %>%
  kableExtra::pack_rows(start_row=7,end_row=9,group_label="Day 14") %>%
  kableExtra::pack_rows(start_row=10,end_row=12,group_label="Any day") %>%
  kableExtra::add_header_above(c(" "=2,"Culture"=2,"LytA"=2," "=1,"Culture"=2,"LytA"=2)) %>%
  kableExtra::add_header_above(c(" "=1,"Saline"=5,"PCV-13"=5))
Table 4.14: Comparison of culture and LytA PCR derived carriage status.
Saline
PCV-13
Culture
LytA
Culture
LytA
n k proportion (95% CI) k proportion (95% CI) n k proportion (95% CI) k proportion (95% CI)
Day 2
CFU 20,000 19 2 10.5% ( 1.3,33.1) 2 10.5% ( 1.3,33.1) 21 0 0.0% ( 0.0,16.1) 0 0.0% ( 0.0,16.1)
CFU 80,000 41 11 26.8% (14.2,42.9) 16 39.0% (24.2,55.5) 33 4 12.1% ( 3.4,28.2) 5 15.2% ( 5.1,31.9)
CFU 160,000 46 9 19.6% ( 9.4,33.9) 11 23.9% (12.6,38.8) 44 4 9.1% ( 2.5,21.7) 5 11.4% ( 3.8,24.6)
Day 7
CFU 20,000 19 1 5.3% ( 0.1,26.0) 3 15.8% ( 3.4,39.6) 21 0 0.0% ( 0.0,16.1) 0 0.0% ( 0.0,16.1)
CFU 80,000 41 8 19.5% ( 8.8,34.9) 9 22.0% (10.6,37.6) 33 3 9.1% ( 1.9,24.3) 3 9.1% ( 1.9,24.3)
CFU 160,000 46 9 19.6% ( 9.4,33.9) 10 21.7% (10.9,36.4) 44 3 6.8% ( 1.4,18.7) 4 9.1% ( 2.5,21.7)
Day 14
CFU 20,000 19 1 5.3% ( 0.1,26.0) 2 10.5% ( 1.3,33.1) 21 0 0.0% ( 0.0,16.1) 0 0.0% ( 0.0,16.1)
CFU 80,000 41 8 19.5% ( 8.8,34.9) 6 14.6% ( 5.6,29.2) 33 1 3.0% ( 0.1,15.8) 1 3.0% ( 0.1,15.8)
CFU 160,000 46 10 21.7% (10.9,36.4) 11 23.9% (12.6,38.8) 44 3 6.8% ( 1.4,18.7) 5 11.4% ( 3.8,24.6)
Any day
CFU 20,000 19 2 10.5% ( 1.3,33.1) 4 21.1% ( 6.1,45.6) 21 0 0.0% ( 0.0,16.1) 0 0.0% ( 0.0,16.1)
CFU 80,000 41 12 29.3% (16.1,45.5) 16 39.0% (24.2,55.5) 33 6 18.2% ( 7.0,35.5) 6 18.2% ( 7.0,35.5)
CFU 160,000 46 16 34.8% (21.4,50.2) 19 41.3% (27.0,56.8) 44 4 9.1% ( 2.5,21.7) 7 15.9% ( 6.6,30.1)

We can also repeat the primary analyses using the LytA PCR data.

Below we display a set of 2x2 tables (one for each inoculation dose and one for the data from all doses combined), stacked on top of each other, summarising the LytA PCR carriage results. While we report the p-values from exact Fisher tests for each 2x2 table, the primary analysis and conclusion are based on the p-value from the log-binomial regression as detailed above (see Table 4.16).

simDat<-simDat %>%
  mutate(doseGroup=factor(paste(sep=" ","CFU",dose),levels=c("CFU 160000","CFU 80000","CFU 20000")))

mod<-logbin::logbin(carriage6BLytA~VaccName+doseGroup,data=simDat)

simDat20<-simDat %>% dplyr::filter(dose==20000)
simDat80<-simDat %>% dplyr::filter(dose==80000)
simDat160<-simDat %>% dplyr::filter(dose==160000)

tab20<-table(simDat20$VaccName,simDat20$carriage6BLytA)
tab80<-table(simDat80$VaccName,simDat80$carriage6BLytA)
tab160<-table(simDat160$VaccName,simDat160$carriage6BLytA)
tabAll<-table(simDat$VaccName,simDat$carriage6BLytA)
tabFull<-rbind(tab20,tab80,tab160,tabAll)

pFish20<-fisher.test(tab20)$p.value
pFish20<-ifelse(pFish20>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFish20)))," < 0.001")
pFish80<-fisher.test(tab80)$p.value
pFish80<-ifelse(pFish80>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFish80)))," < 0.001")
pFish160<-fisher.test(tab160)$p.value
pFish160<-ifelse(pFish160>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFish160)))," < 0.001")
pFishAll<-fisher.test(tabAll)$p.value
pFishAll<-ifelse(pFishAll>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFishAll)))," < 0.001")

pGlm<-summary(mod)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
rrGlmVacc<-exp(c(summary(mod)$coefficients["VaccNamePCV-13","Estimate"],confint(mod)["VaccNamePCV-13",]))

tabFull %>%
  kable(caption=paste(sep="","Number of participants colonised with experimental S. Pneumoniae serotype 6B (as determined by LytA PCR) by study arm and inoculation dose. According to a log-binomial regression model which also adjusts for inoculation dose, there is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of LytA PCR derived serotype 6B carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),")."),col.names=c("Not colonised","Colonised"),row.names=T) %>%
  kable_styling(full_width = F) %>%
  pack_rows(paste(sep="","CFU 20,000 (Fisher test p",pFish20,")"),1,2) %>%
  pack_rows(paste(sep="","CFU 80,000 (Fisher test p",pFish80,")"),3,4) %>%
  pack_rows(paste(sep="","CFU 160,000 (Fisher test p",pFish160,")"),5,6) %>%
  pack_rows(paste(sep="","All doses (Fisher test p",pFishAll,")"),7,8)
Table 4.15: Number of participants colonised with experimental S. Pneumoniae serotype 6B (as determined by LytA PCR) by study arm and inoculation dose. According to a log-binomial regression model which also adjusts for inoculation dose, there is a statistically significant effect of PCV-13 vaccine on the probability of LytA PCR derived serotype 6B carriage (RR 0.38, 95% CI (0.22, 0.66), p
Not colonised Colonised
CFU 20,000 (Fisher test p = 0.042)
Saline 15 4
PCV-13 21 0
CFU 80,000 (Fisher test p = 0.073)
Saline 25 16
PCV-13 27 6
CFU 160,000 (Fisher test p = 0.010)
Saline 27 19
PCV-13 37 7
All doses (Fisher test p < 0.001)
Saline 67 39
PCV-13 85 13
tabFullAugmented<-data.frame(
  DoseGroup=c(rep("CFU 20,000",2),rep("CFU 80,000",2),rep("CFU 160,000",2),rep("All",2)),
  StudyArm=rownames(tabFull),
  carriageNegative=tabFull[,1],
  carriageNegative=tabFull[,2],
  p.value=c(pFish20,NA,pFish80,NA,pFish160,NA,pFishAll,NA)
)

write.csv(tabFullAugmented,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_carriageByStudyArmInoculationDose_LytA.csv"))

We also summarise the log-binomial model fit, reporting the estimated coefficients, their standard errors and associated p-values.

modRes<-summary(mod)$coefficients %>%
  as.data.frame() %>%
  mutate(
    parameter=case_when(
      rownames(summary(mod)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(mod)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(mod)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
      rownames(summary(mod)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000"
    )
  )

colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")

modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  dplyr::mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
  dplyr::mutate(Estimate=exp(Estimate)) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","Summary of the log-binomial regression model fit (for the LytA PCR derived carriage outcome). The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),"). The 'Estimate' column in the table below shows the baseline risk on the '(Intercept)' row and the relative risk associated with each of the exposures on the subsequent rows."),digits=3) %>%
  kable_styling(full_width = FALSE)
Table 4.16: Summary of the log-binomial regression model fit (for the LytA PCR derived carriage outcome). The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 42%, 95% CI (29.9%, 57.5%). There is a statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR 0.38, 95% CI (0.22, 0.66), p
Estimate Std. error Z statistic p-value
(Intercept) 0.415 0.167 -5.279 0.000
PCV-13 vaccine 0.376 0.284 -3.439 0.001
Dose: CFU 80,000 0.980 0.232 -0.089 0.929
Dose: CFU 20,000 0.369 0.486 -2.048 0.041
tt<-modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  dplyr::mutate(Estimate=exp(Estimate))

write.csv(tt,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_ManuscriptTable3_carriageLogBinomialPrimaryAnalysisModelSummary_LytA.csv"))

We summarise the carriage data in Figure 4.5.

plotDat<-simDat %>%
  group_by(vaccInoDose,VaccName,dose) %>%
  summarise(
    n=n(),
    k=sum(carriage6BLytA),
    .groups="drop") %>%
  mutate(
    prop=k/n,
    propLow=NA,
    propUpp=NA
  )

plotDatTmp<-simDat %>%
  group_by(VaccName) %>%
  summarise(
    n=n(),
    k=sum(carriage6BLytA),
    .groups="drop") %>%
  mutate(
    prop=k/n,
    propLow=NA,
    propUpp=NA
  ) %>%
  mutate(
    dose="(any dose)",
    vaccInoDose=paste(sep=" ",VaccName,"(any dose)")
  )

plotDat<-rbind(plotDat,plotDatTmp) %>%
  dplyr::select(vaccInoDose,VaccName,dose,n,k,prop,propLow,propUpp)

for(j in 1:nrow(plotDat)){
  ci<-binom.test(x=plotDat$k[j],n=plotDat$n[j])$conf.int
  plotDat$propLow[j]<-ci[1]
  plotDat$propUpp[j]<-ci[2] 
}

rm(ci,plotDatTmp)

plotDat$dose<-fct_recode(factor(plotDat$dose,levels=c("20000","80000","160000","(any dose)")),"20,000 CFU"="20000","80,000 CFU"="80000","160,000 CFU"="160000","(any dose)"="(any dose)")
plotDat<-plotDat[order(plotDat$VaccName,plotDat$dose),]

plotDat<-plotDat %>%
  mutate(col=c(blueCols,orangeCols))

plotDat$vaccInoDose<-factor(plotDat$vaccInoDose,levels=c("Saline 20,000","Saline 80,000","Saline 160,000","Saline (any dose)","PCV-13 20,000","PCV-13 80,000","PCV-13 160,000","PCV-13 (any dose)"))

yMax<-max(c(plotDat$propUpp))

g1<-plotDat %>%
  filter(VaccName=="Saline") %>%
  ggplot(mapping=aes(x=dose,y=prop,ymin=propLow,ymax=propUpp,fill=dose,label=paste(sep=" / ",k,n))) +
  geom_bar(stat="identity",alpha=0.9) +
  geom_errorbar(width=0.2,col="grey") +
  geom_text(y=-0.012,col="grey50",fontface = "bold") +
  scale_fill_manual(values=c(rgb(70+40,130+40,180+40,maxColorValue=255),rgb(70,130,180,maxColorValue=255),rgb(70-40,130-40,180-40,maxColorValue=255),rgb(70-70,130-100,180-100,maxColorValue=255))) +
  xlab("Study arm & inoculation dose") +
  ylab("Carriage (LytA PCR) proportion") +
  guides(fill="none") +
  theme_light() +
  labs(title="Saline") +
  ylim(c(0,yMax)) +
  geom_vline(xintercept=3.5,col="darkgrey",lwd=1.25,lty=2) +
  theme(axis.text.x = element_text(angle = 45, hjust=1),text=element_text(size=16))

g2<-plotDat %>%
  filter(VaccName=="PCV-13") %>%
  ggplot(mapping=aes(x=dose,y=prop,ymin=propLow,ymax=propUpp,fill=dose,label=paste(sep=" / ",k,n))) +
  geom_bar(stat="identity",alpha=0.9) +
  geom_errorbar(width=0.2,col="grey") +
  geom_text(y=-0.012,col="grey50",fontface = "bold") +
  scale_fill_manual(values=c(rgb(245+10,165+40,0,maxColorValue=255),rgb(245,165,0,maxColorValue=255),rgb(245-10,165-40,0,maxColorValue=255),rgb(245-30,165-120,0,maxColorValue=255))) +
  xlab("Study arm & inoculation dose") +
  ylab("Carriage (LytA PCR) proportion") +
  guides(fill="none") +
  theme_light() +
  labs(title="PCV-13") +
  ylim(c(0,yMax)) +
  geom_vline(xintercept=3.5,col="darkgrey",lwd=1.5,lty=2) +
  theme(axis.text.x = element_text(angle = 45, hjust=1),text=element_text(size=16))

pFish20<-as.numeric(gsub(pattern=" = ",replacement="",pFish20))
pFish80<-as.numeric(gsub(pattern=" = ",replacement="",pFish80))
pFish160<-as.numeric(gsub(pattern=" = ",replacement="",pFish160))
pFishAll<-as.numeric(gsub(pattern=" = ",replacement="",pFishAll))

figCap<-paste(sep="","Carriers out of total participants are indicated below each bar.\nLog-binomial regression p-value for PCV-13 vaccination coefficient, p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),".\nFisher's exact test p-value (overall carriage) ",ifelse(pFishAll>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pFishAll))),"<0.001"),".\nFisher's exact test p-value at doses 20,000, 80,000 and 160,000 CFU = ",paste(collapse=", ",format(nsmall=3,round(digits=3,c(pFish20,pFish80,pFish160)))),".")
grid.arrange(g1,g2,nrow=1,bottom=grid::textGrob(figCap, x = unit(1, "npc"), just="right", gp=gpar(fontsize=10)))
LytA PCR derived Serotype 6B carriage proportions per study arm and inoculation dose.

Figure 4.5: LytA PCR derived Serotype 6B carriage proportions per study arm and inoculation dose.

pdf(width=12,height=8,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_carriageByStudyArmInoculationDose_LytA.pdf"))
grid.arrange(g1,g2,nrow=1,bottom=grid::textGrob(figCap, x = unit(1, "npc"), just="right", gp=gpar(fontsize=10)))
dev.off()
## quartz_off_screen 
##                 2
png(width=12,height=8,res=450,units="in",file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_carriageByStudyArmInoculationDose_LytA.png"))
grid.arrange(g1,g2,nrow=1,bottom=grid::textGrob(figCap, x = unit(1, "npc"), just="right", gp=gpar(fontsize=10)))
dev.off()
## quartz_off_screen 
##                 2

4.2.2.2 LytA PCR density data

densDf<-simDat %>%
  dplyr::select(pid,doseGroup,VaccName,density6BVisitE,density6BVisitF,density6BVisitG,density6BLytAVisitE,density6BLytAVisitF,density6BLytAVisitG) %>%
  dplyr::rename(density6BCultureVisitE=density6BVisitE,density6BCultureVisitF=density6BVisitF,density6BCultureVisitG=density6BVisitG) %>%
  tidyr::pivot_longer(cols=contains("density6B"),names_pattern="density6B(.*)Visit(.)",names_to=c("Method","Visit")) %>%
  tidyr::pivot_wider(id_cols = c(pid,doseGroup,VaccName,Visit),names_from="Method",values_from="value") %>%
  dplyr::mutate(cultPosPcrNeg=0)

for(j in 1:nrow(cultPosPcrNeg)){
  idx<-which(densDf$pid==cultPosPcrNeg$pid[j] & densDf$Visit==cultPosPcrNeg$visit[j])
  if(length(idx)==1){
    densDf$cultPosPcrNeg[idx]<-1
  }
}

densDf %>%
  dplyr::filter(Culture>0 & !is.na(LytA)) %>%
  ggplot(mapping=aes(x=Culture,y=LytA,col=Visit,pch=doseGroup)) +
  geom_point(size=2) +
  scale_x_continuous(trans='log',breaks=c(50,2000,80000,3200000)) +
  scale_y_continuous(trans='log',breaks=c(0.5,20,800,32000)) +
  xlab("Culture; log(CFU / mL)") + 
  ylab("LytA; log(copies / mL)") +
  scale_shape_discrete(name="Inoculation dose") +
  theme_light() +
  theme(legend.position="bottom") 
Scatterplot of logged culture and LytA PCR measured carriage densities.

Figure 4.6: Scatterplot of logged culture and LytA PCR measured carriage densities.

We can calculate correlation coefficients for the 2 density measurements. This will be done on the log scale, given the skew distributions of both measurements.

densDf %>%
  dplyr::filter(Culture>0 & !is.na(LytA)) %>%
  dplyr::mutate(
    Culture=log(Culture),
    LytA=log(LytA)
  ) %>%
  dplyr::group_by(Visit) %>%
  dplyr::summarise(
    Pearson=format(nsmall=2,round(digits=2,cor(Culture,LytA,method="pearson"))),
    PearsonCI=paste(sep="","(",paste(collapse=", ",format(nsmall=2,round(digits=2,ci_cor(Culture,LytA,method="pearson",type="normal")$interval))),")"),
    Spearman=format(nsmall=2,round(digits=2,cor(Culture,LytA,method="spearman"))),
    SpearmanCI=paste(sep="","(",paste(collapse=", ",format(nsmall=2,round(digits=2,ci_cor(Culture,LytA,method="spearman",type="bootstrap")$interval))),")")
  ) %>%
  knitr::kable(caption="Pearson and Spearman correlation coefficients calculated for the logged culture and LytA PCR derived density measurements. Only observations with non-zero measurements in both measures used in the calculation. Confidence intervals were obtained using a normal approximation (Pearson) and bootstrap resampling (Spearman).") %>%
  kableExtra::kable_styling(full_width=FALSE)
Table 4.17: Pearson and Spearman correlation coefficients calculated for the logged culture and LytA PCR derived density measurements. Only observations with non-zero measurements in both measures used in the calculation. Confidence intervals were obtained using a normal approximation (Pearson) and bootstrap resampling (Spearman).
Visit Pearson PearsonCI Spearman SpearmanCI
E 0.94 (0.88, 0.98) 0.92 (0.79, 0.98)
F 0.83 (0.56, 0.94) 0.89 (0.64, 0.98)
G 0.70 (0.27, 0.90) 0.71 (0.28, 0.91)

We can inspect the culture derived densities for the culture negative, PCR positive samples. From Figure 4.7 it appears that the culture positive, PCR negative samples are characterised by lower densities but it is important to note that there are samples with lower density that are both culture and PCR positive.

densDf %>%
  dplyr::filter(Culture>=0 | !is.na(LytA)) %>% 
  ggplot(mapping=aes(y=Culture,x=ifelse(cultPosPcrNeg==0,"Culture positive,\nPCR positive","Culture positive,\nPCR negative"))) +
  geom_boxplot(col="gray") +
  geom_jitter(width=0.25,height=0) +
  scale_y_continuous(trans='log') +
  theme_light() +
  xlab("") +
  ylab("Culture; log(CFU / mL)") +
  labs(caption="Note that there are 3 culture positive, PCR negative samples where the corresponding culture derived density is 0.")
Comparison of culture derived density measurements between culture positive, PCR negative and culture positive, PCR positive samples.

Figure 4.7: Comparison of culture derived density measurements between culture positive, PCR negative and culture positive, PCR positive samples.

4.2.3 Secondary analyses

4.2.3.1 Descriptive analyses

We have calculated carriage proportions per study arm at each follow-up visit (visits E, F, G).

plotDatLine<-simDat %>%
  dplyr::select(pid,dose,VaccName,carriageVisitE,carriageVisitF,carriageVisitG) %>%
  pivot_longer(cols=c(carriageVisitE,carriageVisitF,carriageVisitG),names_to="visit",values_to="carriage") %>%
  mutate(visit=gsub(visit,pattern="carriage",replacement="")) %>%
  group_by(VaccName,visit,dose) %>%
  summarise(
    n=sum(!is.na(carriage)),
    k=sum(carriage,na.rm=T),
    .groups="drop") %>%
  mutate(
    prop=k/n,
    propLow=NA,
    propUpp=NA
  ) %>%
  mutate(
    visitNum=case_when(
      visit=="VisitE"~2,
      visit=="VisitF"~7,
      visit=="VisitG"~14,
      TRUE~NA_real_
    ),
    vaccInoDose=factor(paste(sep=" ",VaccName,format(dose,big.mark=",",trim=TRUE)),levels=c("Saline 20,000","Saline 80,000","Saline 160,000","PCV-13 20,000","PCV-13 80,000","PCV-13 160,000"))
  )

for(j in 1:nrow(plotDatLine)){
  ci<-binom.test(x=plotDatLine$k[j],n=plotDatLine$n[j])$conf.int
  plotDatLine$propLow[j]<-ci[1]
  plotDatLine$propUpp[j]<-ci[2]
  rm(ci)
}

plotDatLineAll<-simDat %>%
  dplyr::select(pid,dose,VaccName,carriageVisitE,carriageVisitF,carriageVisitG) %>%
  pivot_longer(cols=c(carriageVisitE,carriageVisitF,carriageVisitG),names_to="visit",values_to="carriage") %>%
  mutate(visit=gsub(visit,pattern="carriage",replacement="")) %>%
  group_by(VaccName,visit) %>%
  summarise(
    n=sum(!is.na(carriage)),
    k=sum(carriage,na.rm=T),
    .groups="drop") %>%
  mutate(
    prop=k/n,
    propLow=NA,
    propUpp=NA
  ) %>%
  mutate(
    visitNum=case_when(
      visit=="VisitE"~2,
      visit=="VisitF"~7,
      visit=="VisitG"~14,
      TRUE~NA_real_
    ),
    vaccInoDose=fct_recode(VaccName,"Saline (any dose)"="Saline","PCV-13 (any dose)"="PCV-13")
  ) %>%
  mutate(dose="All")

for(j in 1:nrow(plotDatLineAll)){
  ci<-binom.test(x=plotDatLineAll$k[j],n=plotDatLineAll$n[j])$conf.int
  plotDatLineAll$propLow[j]<-ci[1]
  plotDatLineAll$propUpp[j]<-ci[2]
  rm(ci)
}

plotDatLine<-rbind(plotDatLine,plotDatLineAll)

plotDatLine$dose<-fct_recode(factor(plotDatLine$dose,levels=c("20000","80000","160000","All")),"20,000 CFU"="20000","80,000 CFU"="80000","160,000 CFU"="160000","Any dose"="All")
plotDatLine$vaccInoDose<-factor(plotDatLine$vaccInoDose,levels=c("Saline 20,000","Saline 80,000","Saline 160,000","Saline (any dose)","PCV-13 20,000","PCV-13 80,000","PCV-13 160,000","PCV-13 (any dose)"))
    
yMax<-max(plotDatLine$propUpp)
                                 
plotDatLine %>%
  ggplot(mapping=aes(x=visitNum,y=prop,ymin=propLow,ymax=propUpp,col=VaccName)) +
  geom_line(lty=2,lwd=1) +
  geom_point(size=3) +
  geom_errorbar(width=0.3,lwd=0.65) +
  scale_color_manual(values=c("steelblue","orange"),name="Study arm") +
  #scale_color_manual(values=c(blueCols,orangeCols),name="Vaccination / inoculation arm") +
  xlab("Study visit (days since inoculation)") +
  ylab("Carriage proportion") +
  xlim(c(0,15)) +
  ylim(c(0,yMax)) +
  guides(fill="none") +
  theme_light() +
  labs(caption="Confidence intervals are exact binomial confidence intervals.\n") +
  facet_wrap("dose")
Serotype 6B carriage proportions per study arm and inoculation dose over time.

Figure 4.8: Serotype 6B carriage proportions per study arm and inoculation dose over time.

4.2.3.2 Carriage density (culture)

Unless stated otherwise, the density measurements used in these analyses are the culture-derived CFU measurements.

Conditional on a participant being colonised with pneumococcus at a given visit, we compare carriage density between vaccination groups, inoculation dose and serotype (6B induced carriage or other serotype natural carriage). Specifically, we use generalised estimating equations (GEE) with an exchangeable correlation structure to fit a linear regression model, with Gaussian errors and log link function, to the data from visits C, E, F, G, restricted to data only from participants that were carriers at that visit, with carriage density as response (given the log link we have added 1 to each numeric desnity measurement) and vaccination status, inoculation dose and type of carriage as predictors.

simDatLongCarriersOnly <- simDat %>%
  dplyr::select(pid,VaccName,dose,densityVisitC,densityVisitE,densityVisitF,densityVisitG,carriageSerotypeVisitC,carriageSerotypeVisitE,carriageSerotypeVisitF,carriageSerotypeVisitG) %>%
  pivot_longer(cols=-c(pid,VaccName,dose),names_to=c(".value","Visit"),names_pattern="(.+)Visit(.+)") %>%
  mutate(
    serotype6B=ifelse(carriageSerotype=="6B",1,0),
      densityPlusOne=density+1
  ) %>%
  filter(!is.na(density) & !is.na(serotype6B))

geeModCarrDens<-geeglm(densityPlusOne~VaccName+serotype6B+factor(dose),id=factor(pid),corstr="exchangeable",data=simDatLongCarriersOnly,family=stats::gaussian("log"))

geeSum<-summary(geeModCarrDens)$coefficients
colnames(geeSum)[colnames(geeSum)=="Pr(>|W|)"]<-"p.value"

geeSum %>%
  mutate(
    parameter=case_when(
      rownames(summary(geeModCarrDens)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(geeModCarrDens)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(geeModCarrDens)$coefficients)=="serotype6B"~"Serotype 6B",
      rownames(summary(geeModCarrDens)$coefficients)=="factor(dose)80000"~"Dose: CFU 80,000",
      rownames(summary(geeModCarrDens)$coefficients)=="factor(dose)160000"~"Dose: CFU 160,000"
    )
  ) %>%
  as.data.frame() %>%
  dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="Results from the GEE model fit to carriage density data. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
  kable_styling(full_width = FALSE)
Table 4.18: Results from the GEE model fit to carriage density data. Model fitted only to data from participants who were carriers at the time of each given visit.
Estimate Std. error Wald statistic p-value
(Intercept) 3.091019 1.0941320 7.981124 0.0047
PCV-13 vaccine -2.806351 0.9877270 8.072536 0.0045
Serotype 6B 3.888376 1.0337286 14.148922 0.0002
Dose: CFU 80,000 5.365698 0.8335656 41.435531 0.0000
Dose: CFU 160,000 2.868503 0.9955516 8.302009 0.0040

We repeat this analysis, stratified by inoculation dose (note that any inoculation dose group with too few carriers for model fitting will skipped for this analysis).

geeModCarrDens20<-geeglm(densityPlusOne~VaccName+serotype6B,id=factor(pid),corstr="exchangeable",data=simDatLongCarriersOnly %>% dplyr::filter(dose==20000),family=stats::gaussian("log"))

geeSum20<-summary(geeModCarrDens20)$coefficients
colnames(geeSum20)[colnames(geeSum20)=="Pr(>|W|)"]<-"p.value"

geeSum20 %>%
  mutate(
    parameter=case_when(
      rownames(summary(geeModCarrDens20)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(geeModCarrDens20)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(geeModCarrDens20)$coefficients)=="serotype6B"~"Serotype 6B"
    )
  ) %>%
  as.data.frame() %>%
  dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="Results from the GEE model fit to carriage density data for the CFU 20,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
  kable_styling(full_width = FALSE)
Table 4.19: Results from the GEE model fit to carriage density data for the CFU 20,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.
Estimate Std. error Wald statistic p-value
(Intercept) 7.2730891 0.3376032 464.1140816 0.0000
PCV-13 vaccine 1.5380909 0.7345339 4.3847045 0.0363
Serotype 6B -0.4265342 0.5404508 0.6228672 0.4300
geeModCarrDens80<-geeglm(densityPlusOne~VaccName+serotype6B,id=factor(pid),corstr="exchangeable",data=simDatLongCarriersOnly %>% dplyr::filter(dose==80000),family=stats::gaussian("log"))

geeSum80<-summary(geeModCarrDens80)$coefficients
colnames(geeSum80)[colnames(geeSum80)=="Pr(>|W|)"]<-"p.value"

geeSum80 %>%
  mutate(
    parameter=case_when(
      rownames(summary(geeModCarrDens80)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(geeModCarrDens80)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(geeModCarrDens80)$coefficients)=="serotype6B"~"Serotype 6B"
    )
  ) %>%
  as.data.frame() %>%
  dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="Results from the GEE model fit to carriage density data for the CFU 80,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
  kable_styling(full_width = FALSE)
Table 4.20: Results from the GEE model fit to carriage density data for the CFU 80,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.
Estimate Std. error Wald statistic p-value
(Intercept) 8.352449 1.496112 31.167308 0.0000
PCV-13 vaccine -2.816034 0.994796 8.013233 0.0046
Serotype 6B 4.001771 1.278855 9.791788 0.0018
geeModCarrDens160<-geeglm(densityPlusOne~VaccName+serotype6B,id=factor(pid),corstr="exchangeable",data=simDatLongCarriersOnly %>% dplyr::filter(dose==160000),family=stats::gaussian("log"))

geeSum160<-summary(geeModCarrDens160)$coefficients
colnames(geeSum160)[colnames(geeSum160)=="Pr(>|W|)"]<-"p.value"

geeSum160 %>%
  mutate(
    parameter=case_when(
      rownames(summary(geeModCarrDens160)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(geeModCarrDens160)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(geeModCarrDens160)$coefficients)=="serotype6B"~"Serotype 6B"
    )
  ) %>%
  as.data.frame() %>%
  dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="Results from the GEE model fit to carriage density data for the CFU 160,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
  kable_styling(full_width = FALSE)
Table 4.21: Results from the GEE model fit to carriage density data for the CFU 160,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.
Estimate Std. error Wald statistic p-value
(Intercept) 8.111231 0.3768946 463.162917 0.0000
PCV-13 vaccine -2.691829 1.2014159 5.020052 0.0251
Serotype 6B 1.737010 1.0258344 2.867150 0.0904

For graphical representation, on Figure 4.9 we show average carriage densities at each visit, stratified by serotype (6B or other) and inoculation dose group and on Figure 4.10 we show the underlying data as jitter and box plots.

simDatLongCarriersOnly %>%
  group_by(Visit,VaccName,dose,serotype6B) %>%
  summarise(
    densityMedian=median(density),
    densityQ25=quantile(density,probs=0.25),
    densityQ75=quantile(density,probs=0.75),
    .groups="drop"
  ) %>%
  complete(Visit,VaccName,dose,serotype6B) %>%
  mutate(
    serotype6B=case_when(
      serotype6B==0~"Not 6B",
      serotype6B==1~"6B"
    )
  ) %>%
  ggplot(mapping=aes(x=Visit,y=densityMedian,ymin=densityQ25,ymax=densityQ75,fill=VaccName)) +
  geom_bar(stat="identity",position=position_dodge()) +
  geom_errorbar(width=0.5,position=position_dodge(width=0.9),col="darkgrey",lwd=0.5) +
  theme_light() +
  ylab("Carriage density") +
  facet_wrap(~serotype6B+factor(paste(sep=" ","CFU",format(dose,big.mark=",",trim=TRUE)),levels=c("CFU 20,000","CFU 80,000","CFU 160,000"))) +
  scale_fill_manual(values=c("steelblue","orange")) +
  scale_y_continuous(trans="log10")
Median carriage densities at the different study visits, stratified by study arm, serotype and inoculation dose.

Figure 4.9: Median carriage densities at the different study visits, stratified by study arm, serotype and inoculation dose.

simDatLongCarriersOnly %>%
  mutate(
    serotype6B=case_when(
      serotype6B==0~"Not 6B",
      serotype6B==1~"6B")
    ) %>%
  ggplot(mapping=aes(x=paste(sep=" ",Visit,VaccName),y=density)) +
  geom_boxplot(mapping=aes(fill=VaccName),col="darkgrey",alpha=0.25) +
  geom_jitter(mapping=aes(col=VaccName,group=VaccName),height=0,width=0.1,alpha=0.75) +
  theme_light() +
  ylab("Carriage density") +
  xlab("Visit") + 
  facet_wrap(~serotype6B+factor(paste(sep=" ","CFU",format(dose,big.mark=",",trim=TRUE)),levels=c("CFU 20,000","CFU 80,000","CFU 160,000"))) +
  scale_color_manual(values=c("steelblue","orange"),name="Study arm") +
  scale_fill_manual(values=rep("white",2),name="Study arm") +
  scale_y_continuous(trans="log10") +
  scale_x_discrete(labels=c("Visit","C","Visit","E","Visit","F","Visit","G"))
Jitter and box plot of carriage density at the different study visits by serotype and inoculation group.

Figure 4.10: Jitter and box plot of carriage density at the different study visits by serotype and inoculation group.

4.2.3.3 Carriage density (LytA)

Unless stated otherwise, the density measurements used in these analyses are the LytA PCR-derived density measurements.

Conditional on a participant being colonised with pneumococcus at a given visit, we compare LytA PCR carriage density between vaccination groups, inoculation dose and serotype (6B induced carriage or other serotype natural carriage). Specifically, we use generalised estimating equations (GEE) with an exchangeable correlation structure to fit a linear regression model, with Gaussian errors and log link function, to the data from visits C, E, F, G, restricted to data only from participants that were carriers at that visit, with carriage density as response (given the log link we have added 1 to each numeric desnity measurement) and vaccination status, inoculation dose and type of carriage as predictors.

simDatLongLytACarriersOnly <- simDat %>%
  dplyr::select(pid,VaccName,dose,density6BLytAVisitE,density6BLytAVisitF,density6BLytAVisitG,carriage6BLytAVisitE,carriage6BLytAVisitF,carriage6BLytAVisitG) %>%
  tidyr::pivot_longer(cols=-c(pid,VaccName,dose),names_to=c(".value","Visit"),names_pattern="(.+)6BLytAVisit(.+)") %>%
  dplyr::mutate(
    serotype6B=carriage,
    densityPlusOne=density+1
  ) %>%
  dplyr::filter(!is.na(density) & !is.na(serotype6B)) %>%
  dplyr::mutate(doseFct=factor(dose,levels=c("160000","80000","20000")))

geeModCarrDens<-geeglm(densityPlusOne~VaccName+doseFct,id=factor(pid),corstr="exchangeable",data=simDatLongLytACarriersOnly,family=stats::gaussian("log"))

geeSum<-summary(geeModCarrDens)$coefficients
colnames(geeSum)[colnames(geeSum)=="Pr(>|W|)"]<-"p.value"

geeSum %>%
  mutate(
    parameter=case_when(
      rownames(summary(geeModCarrDens)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(geeModCarrDens)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(geeModCarrDens)$coefficients)=="doseFct80000"~"Dose: CFU 80,000",
      rownames(summary(geeModCarrDens)$coefficients)=="doseFct20000"~"Dose: CFU 20,000"
    )
  ) %>%
  as.data.frame() %>%
  dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="Results from the GEE model fit to LytA PCR density data. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
  kable_styling(full_width = FALSE)
Table 4.22: Results from the GEE model fit to LytA PCR density data. Model fitted only to data from participants who were carriers at the time of each given visit.
Estimate Std. error Wald statistic p-value
(Intercept) 0.8833416 0.2861391 9.530218 0.0020
PCV-13 vaccine -4.2215603 0.9116877 21.441431 0.0000
Dose: CFU 80,000 6.2873579 0.6826380 84.831163 0.0000
Dose: CFU 20,000 4.6792644 1.0006524 21.866976 0.0000

We repeat this analysis, stratified by inoculation dose (note that any inoculation dose group with too few carriers for model fitting will skipped for this analysis).

In particular this means skipping the entire CFU 20,000 inoculation group as all LytA PCR determined 6B carriers in this group are in the saline group.

geeModCarrDens80<-geeglm(densityPlusOne~VaccName,id=factor(pid),corstr="exchangeable",data=simDatLongLytACarriersOnly %>% dplyr::filter(dose==80000),family=stats::gaussian("log"))

geeSum80<-summary(geeModCarrDens80)$coefficients
colnames(geeSum80)[colnames(geeSum80)=="Pr(>|W|)"]<-"p.value"

geeSum80 %>%
  mutate(
    parameter=case_when(
      rownames(summary(geeModCarrDens80)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(geeModCarrDens80)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine"
    )
  ) %>%
  as.data.frame() %>%
  dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="Results from the GEE model fitted to LytA PCR density data for the CFU 80,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
  kable_styling(full_width = FALSE)
Table 4.23: Results from the GEE model fitted to LytA PCR density data for the CFU 80,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.
Estimate Std. error Wald statistic p-value
(Intercept) 7.160430 0.6205136 133.16054 0.0000
PCV-13 vaccine -4.215907 0.9119749 21.37058 0.0000
geeModCarrDens160<-geeglm(densityPlusOne~VaccName,id=factor(pid),corstr="exchangeable",data=simDatLongLytACarriersOnly %>% dplyr::filter(dose==160000),family=stats::gaussian("log"))

geeSum160<-summary(geeModCarrDens160)$coefficients
colnames(geeSum160)[colnames(geeSum160)=="Pr(>|W|)"]<-"p.value"

geeSum160 %>%
  mutate(
    parameter=case_when(
      rownames(summary(geeModCarrDens160)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(geeModCarrDens160)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine"
    )
  ) %>%
  as.data.frame() %>%
  dplyr::select(parameter,Estimate,Std.err,Wald,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Wald statistic","p-value"),escape=FALSE,caption="Results from the GEE model fitted to LytA PCR density data for the CFU 160,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.") %>%
  kable_styling(full_width = FALSE)
Table 4.24: Results from the GEE model fitted to LytA PCR density data for the CFU 160,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.
Estimate Std. error Wald statistic p-value
(Intercept) 0.8495388 0.2816220 9.099827 0.0026
PCV-13 vaccine -0.4353438 0.3060198 2.023792 0.1549

For graphical representation, on Figure 4.11 we show average carriage densities at each visit, study arm and inoculation dose group and on Figure 4.12 we show the underlying data as jitter and box plots.

simDatLongLytACarriersOnly %>%
  group_by(Visit,VaccName,dose) %>%
  summarise(
    densityMedian=median(density),
    densityQ25=quantile(density,probs=0.25),
    densityQ75=quantile(density,probs=0.75),
    .groups="drop"
  ) %>%
  complete(Visit,VaccName,dose) %>%
  ggplot(mapping=aes(x=Visit,y=densityMedian,ymin=densityQ25,ymax=densityQ75,fill=VaccName)) +
  geom_bar(stat="identity",position=position_dodge()) +
  geom_errorbar(width=0.5,position=position_dodge(width=0.9),col="darkgrey",lwd=0.5) +
  theme_light() +
  ylab("Carriage density") +
  facet_wrap(~factor(paste(sep=" ","CFU",format(dose,big.mark=",",trim=TRUE)),levels=c("CFU 20,000","CFU 80,000","CFU 160,000"))) +
  scale_fill_manual(values=c("steelblue","orange")) +
  scale_y_continuous(trans="log10")
Median LytA carriage densities at the different study visits, stratified by study arm and inoculation dose.

Figure 4.11: Median LytA carriage densities at the different study visits, stratified by study arm and inoculation dose.

simDatLongLytACarriersOnly %>%
  ggplot(mapping=aes(x=paste(sep=" ",Visit,VaccName),y=density)) +
  geom_boxplot(mapping=aes(fill=VaccName),col="darkgrey",alpha=0.25) +
  geom_jitter(mapping=aes(col=VaccName,group=VaccName),height=0,width=0.1,alpha=0.75) +
  theme_light() +
  ylab("Carriage density") +
  xlab("Visit") + 
  facet_wrap(~factor(paste(sep=" ","CFU",format(dose,big.mark=",",trim=TRUE)),levels=c("CFU 20,000","CFU 80,000","CFU 160,000"))) +
  scale_color_manual(values=c("steelblue","orange"),name="Study arm") +
  scale_fill_manual(values=rep("white",2),name="Study arm") +
  scale_y_continuous(trans="log10") +
  scale_x_discrete(labels=c("Visit","C","Visit","E","Visit","F","Visit","G"))
Jitter and box plot of LytA carriage density at the different study visits by serotype and inoculation group.

Figure 4.12: Jitter and box plot of LytA carriage density at the different study visits by serotype and inoculation group.

4.2.3.4 Carriage duration

Given that there are only a small number of fixed visits where carriage is assessed, we have not performed a formal analysis of duration data (both start and end times of carriage events are all either left, interval or right censored). However we do provide a visual summary of the duration data, by carriage type (6B or other serotype), study arm and visit of first recorded carriage. Specifically we show mean carriage durations as well as the standard error for this mean duration in Figure 4.13.

Given the censored nature of carriage start and end times, we need to make a few pragmatic decisions for the visual summaries. If carriage is only recorded at the last visit (visit G), then no duration can be inferred. Where carriage ceases between any 2 visits, the duration is computed as if it occurred at the mid-point of the interval given by the two visits. Where carriage is still ongoing at the last visit (visit G), carriage duration is recorded as if it lasted until visit G. The start time of carriage is taken as the visit at which carriage was first detected.

simDat<-simDat %>%
  mutate(
    visitFirstCarriage6B=case_when(
      carriageVisitE==1~"E",
      carriageVisitE==0 & carriageVisitF==1~"F",
      carriageVisitE==0 & carriageVisitF==0 & carriageVisitG==1~"G",
      TRUE~NA_character_
    ),
    visitFirstCarriageNot6B=case_when(
      carriageNot6BVisitC==1~"C",
      carriageNot6BVisitC==0 & carriageNot6BVisitE==1~"E",
      carriageNot6BVisitC==0 & carriageNot6BVisitE==0 & carriageNot6BVisitF==1~"F",
      carriageNot6BVisitC==0 & carriageNot6BVisitE==0 & carriageNot6BVisitF==0 & carriageNot6BVisitG==1~"G",
      TRUE~NA_character_
    ),
    durationFirstCarriage6B=case_when(
      carriageVisitE==1 & carriageVisitF==0~7/2,
      carriageVisitE==1 & carriageVisitF==1 & carriageVisitG==0~7+14/2,
      carriageVisitE==1 & carriageVisitF==1 & carriageVisitG==1~7+14,
      carriageVisitE==0 & carriageVisitF==1 & carriageVisitG==0~14/2,
      carriageVisitE==0 & carriageVisitF==1 & carriageVisitG==1~14
    ),
    durationFirstCarriageNot6B=case_when(
      carriageNot6BVisitC==1 & carriageNot6BVisitE==0~9/2,
      carriageNot6BVisitC==1 & carriageNot6BVisitE==1 & carriageNot6BVisitF==0~9+7/2,
      carriageNot6BVisitC==1 & carriageNot6BVisitE==1 & carriageNot6BVisitF==1 & carriageNot6BVisitG==0~9+7+14/2,
      carriageNot6BVisitC==1 & carriageNot6BVisitE==1 & carriageNot6BVisitF==1 & carriageNot6BVisitG==1~9+7+14,
      carriageNot6BVisitC==0 & carriageNot6BVisitE==1 & carriageNot6BVisitF==0~7/2,
      carriageNot6BVisitC==0 & carriageNot6BVisitE==1 & carriageNot6BVisitF==1 & carriageNot6BVisitG==0~7+14/2,
      carriageNot6BVisitC==0 & carriageNot6BVisitE==1 & carriageNot6BVisitF==1 & carriageNot6BVisitG==1~7+14,
      carriageNot6BVisitC==0 & carriageNot6BVisitE==0 & carriageNot6BVisitF==1 & carriageNot6BVisitG==0~14/2,
      carriageNot6BVisitC==0 & carriageNot6BVisitE==0 & carriageNot6BVisitF==1 & carriageNot6BVisitG==1~14,
    )
  )

durationDat6B<-simDat %>%
  filter(carriage==1 & visitFirstCarriage6B!="G") %>%
  group_by(VaccName,visitFirstCarriage6B) %>%
  summarise(
    dur6BMean=mean(durationFirstCarriage6B,na.rm=T),
    dur6BSE=sd(durationFirstCarriage6B,na.rm=T)/sum(!is.na(durationFirstCarriage6B)),
    .groups="drop"
  ) %>%
  tidyr::complete(VaccName,visitFirstCarriage6B)

durationDatNot6B<-simDat %>%
  filter(carriageNot6B==1 & visitFirstCarriageNot6B!="G") %>%
  group_by(VaccName,visitFirstCarriageNot6B) %>%
  summarise(
    durNot6BMean=mean(durationFirstCarriageNot6B,na.rm=T),
    durNot6BSE=sd(durationFirstCarriageNot6B,na.rm=T)/sum(!is.na(durationFirstCarriageNot6B)),
    .groups="drop"
  ) %>%
  tidyr::complete(VaccName,visitFirstCarriageNot6B)

durationDat<-data.frame(
  VaccName=c(durationDat6B$VaccName,durationDatNot6B$VaccName),
  visitFirstCarriage=c(durationDat6B$visitFirstCarriage6B,durationDatNot6B$visitFirstCarriageNot6B),
  durMean=c(durationDat6B$dur6BMean,durationDatNot6B$durNot6BMean),
  durSE=c(durationDat6B$dur6BSE,durationDatNot6B$durNot6BSE),
  type=c(rep("6B carriage",nrow(durationDat6B)),rep("Non-6B carriage",nrow(durationDatNot6B)))
)

durationDat %>%
  ggplot(mapping=aes(x=factor(visitFirstCarriage,levels=sort(decreasing=T,unique(visitFirstCarriage))),y=durMean,ymin=durMean-durSE,ymax=durMean+durSE,fill=VaccName)) +
  geom_bar(stat="identity",position=position_dodge()) +
  geom_errorbar(width=0.5,position=position_dodge(width=0.9),col="darkgrey",lwd=0.5) +
  scale_fill_manual(values=c("steelblue","orange")) +
  coord_flip() +
  xlab("Visit of first carriage") +
  ylab("Mean carriage duration (days)") +
  theme_light() +
  labs(caption="Error bars indicate the standard error of the sample mean of carriage duration.\nAbsence of error bars means a single observation.\nAbsence of colour bar means no recorded first carriage at that visit for that vaccination group.") +
  theme(axis.text.y=element_text(size=20)) +
  facet_wrap("type",nrow = 2)
Mean carriage duration by serotype, study arm and visit of first carriage.

Figure 4.13: Mean carriage duration by serotype, study arm and visit of first carriage.

4.2.3.5 Total carriage density during study

We define the total experimental carriage density during the study as the area under the time-density curve (tdAUC) for 6B colonisation. We use the trapezoidal rule to calculate the tdAUC using the density measurements at visits E, F, G and assuming a visit C density of 0 CFU/ml. Likewise, density at any visit where no carriage was detected is taken to be 0 CFU/ml.

To calculate tdAUC, we use the actual density measurements, but for statistical tests and models we use \(log_{10}(\mbox{tdAUC + 1})\) as the tdAUC distribution is severely skew (we restrict comparisons to carriers, so all tdAUCs used in analyses will be positive).

As per the main analysis of binary carriage status, we compare \(log_{10}(\mbox{tdAUC + 1})\) measurements for experimental carriers using a generalised linear model between PCV-13 and saline randomised participants while accounting for inoculation dose (Figure 4.14).

We also compare the tdAUC for experimental carriers between PCV-13 vaccinated and saline randomised participants using the two-sample Mann-Whitney-Wilcoxon test, stratified by inoculation dose - see Table 4.25 (for doses where there are no carriers in one of the 2 groups, this analysis has been skipped).

We summarise these results in 2 types of graph: 1) box and jitter plots showing the carriage densities in the different study arms, stratified by inoculation dose (Figure 4.14) and 2) line graphs showing the individual and average colonisation density profiles over time in the two groups, again stratified by inoculation dose (Figure 4.15).

options(scipen=99)

simDat<-simDat %>%
  dplyr::mutate(
    density6BVisitE=case_when(carriageVisitE==1~densityVisitE,TRUE~0),
    density6BVisitF=case_when(carriageVisitF==1~densityVisitF,TRUE~0),
    density6BVisitG=case_when(carriageVisitG==1~densityVisitG,TRUE~0),
    densityNot6BVisitC=case_when(carriageNot6BVisitC==1~densityVisitC,TRUE~0),
    densityNot6BVisitE=case_when(carriageNot6BVisitE==1~densityVisitE,TRUE~0),
    densityNot6BVisitF=case_when(carriageNot6BVisitF==1~densityVisitF,TRUE~0),
    densityNot6BVisitG=case_when(carriageNot6BVisitG==1~densityVisitG,TRUE~0)
  )

tdAUCVect<-function(dateVec,densVec){
  n<-length(dateVec)
  if(length(densVec)!=n){stop("dateVec and densVec need to be of the same length.")}
  
  densVec[is.na(densVec)]<-0
  #densVec<-log10(densVec+1)
  
  midPoints<-(densVec[1:(n-1)]+densVec[2:n])/2
  widths<-as.numeric(ymd(dateVec[2:n])-ymd(dateVec[1:(n-1)]))
  
  auc<-sum(midPoints*widths)
  return(auc)
}

tdAUCMat<-function(dateCols,densCols){
  n<-ncol(dateCols)
  if(ncol(densCols)!=n){stop("dateCols and densCols need to have the same number of columns.")}
  
  densCols[is.na(densCols)]<-0
  #densCols<-log10(densCols+1)
  
  for(j in 1:n){dateCols[,j]<-ymd(dateCols[,j])}
  
  midPoints<-(densCols[,1:(n-1)]+densCols[,2:n])/2
  widths<-sapply(FUN=as.numeric,X=dateCols[,2:n]-dateCols[,1:(n-1)])
  
  auc<-rowSums(midPoints*widths)
  return(auc)
}

simDat<-simDat %>%
  dplyr::mutate(tdAUC=tdAUCMat(dateCols=simDat %>% dplyr::select(visitC_date,visitE_date,visitF_date,visitG_date),densCols=cbind(0,simDat %>% dplyr::select(density6BVisitE,density6BVisitF,density6BVisitG))))

tdAUCmod<-glm(log10(tdAUC+1)~VaccName+doseGroup,data=simDat %>% filter(carriage==1))

tdAUCwilcox20<-try(silent=TRUE,wilcox.test(simDat$tdAUC[simDat$doseGroup=="CFU 20000" & simDat$carriage==1 & simDat$VaccName=="PCV-13"],simDat$tdAUC[simDat$doseGroup=="CFU 20000" & simDat$carriage==1 & simDat$VaccName=="Saline"]))
tdAUCwilcox80<-try(silent=TRUE,wilcox.test(simDat$tdAUC[simDat$doseGroup=="CFU 80000" & simDat$carriage==1 & simDat$VaccName=="PCV-13"],simDat$tdAUC[simDat$doseGroup=="CFU 80000" & simDat$carriage==1 & simDat$VaccName=="Saline"]))
tdAUCwilcox160<-try(silent=TRUE,wilcox.test(simDat$tdAUC[simDat$doseGroup=="CFU 160000" & simDat$carriage==1 & simDat$VaccName=="PCV-13"],simDat$tdAUC[simDat$doseGroup=="CFU 160000" & simDat$carriage==1 & simDat$VaccName=="Saline"]))

simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 20000","CFU 80000","CFU 160000"))

wilcoxResDf<-simDat %>%
  dplyr::select(carriage,doseGroup,VaccName,tdAUC) %>%
  dplyr::filter(carriage==1) %>%
  dplyr::group_by(doseGroup,VaccName) %>%
  dplyr::summarise(
    median=format(nsmall=2,round(digits=2,(median(tdAUC)))),
    IQR=paste(sep="","(",format(nsmall=2,round(digits=2,quantile(tdAUC,prob=0.25))),",",format(nsmall=2,round(digits=2,quantile(tdAUC,prob=0.75))),")"),
    .groups="drop"
  )
  wilcoxResDf<-tidyr::complete(data=wilcoxResDf,doseGroup,VaccName,fill=list(median="-",IQR="-")) %>%
    tidyr::pivot_wider(id_cols=doseGroup,names_from=VaccName,values_from=c(median,IQR)) %>%
    dplyr::mutate(p.value=NA)
  
  for(j in 1:nrow(wilcoxResDf)){
    if(
      wilcoxResDf$doseGroup[j]=="CFU 20000" & class(tdAUCwilcox20)!="try-error"){wilcoxResDf$p.value[j]<-format(nsmall=4,round(digits=4,tdAUCwilcox20$p.value))
    }else if(
      wilcoxResDf$doseGroup[j]=="CFU 80000" & class(tdAUCwilcox80)!="try-error"){wilcoxResDf$p.value[j]<-format(nsmall=4,round(digits=4,tdAUCwilcox80$p.value))
    }else if(
      wilcoxResDf$doseGroup[j]=="CFU 160000" & class(tdAUCwilcox160)!="try-error"){wilcoxResDf$p.value[j]<-format(nsmall=4,round(digits=4,tdAUCwilcox160$p.value))
    }else{
      wilcoxResDf$p.value[j]<-"-"
    }
    if(wilcoxResDf$p.value[j]=="0.0000"){wilcoxResDf$p.value[j]<-"<0.0001"}
  }
 
wilcoxResDf<-wilcoxResDf[,c("doseGroup","median_PCV-13","IQR_PCV-13","median_Saline","IQR_Saline","p.value")]
   
wilcoxResDf %>%
  knitr::kable(col.names=c("Dose","Median","IQR","Median","IQR","p.value"),caption="Mann-Whitney-Wilcoxon test result for comparing tdAUC between study arms by inoculation group.") %>%
  kableExtra::kable_styling(full_width=FALSE) %>%
  kableExtra::add_header_above(c(" "=1,"PCV-13"=2,"Saline"=2," "=1))
Table 4.25: Mann-Whitney-Wilcoxon test result for comparing tdAUC between study arms by inoculation group.
PCV-13
Saline
Dose Median IQR Median IQR p.value
CFU 20000
12380.19 (6360.20,18400.19)
CFU 80000 22816.47 (5329.53,56105.25) 124976.60 (3885.48,731473.93) 0.3845
CFU 160000 860.35 (440.43,1493.27) 251.77 (15.54,2189.90) 0.4769
simDat %>%
  filter(carriage==1) %>%
  ggplot(mapping=aes(x=VaccName,y=tdAUC,col=VaccName)) +
  geom_boxplot(alpha=0.5) +
  geom_jitter(height=0,width=0.25,alpha=0.5) +
  theme_light() +
  scale_color_manual(values=c("steelblue","orange")) +
  guides(color="none") +
  facet_wrap(~doseGroup,nrow=1) +
  xlab("") +
  ylab("total density AUC (CFU days/ml)") +
  scale_y_continuous(trans = "log10") +
  labs(caption=paste(sep="","Conditional on carriage and on inoculation dose, the effect of PCV-13\nvaccination on log10(tdAUC) is ",round(digits=2,summary(tdAUCmod)$coefficients["VaccNamePCV-13","Estimate"])," (p=",round(digits=4,summary(tdAUCmod)$coefficients["VaccNamePCV-13","Pr(>|t|)"]),")."))
Total carriage density (tdAUC) by study arm and inoculation dose.

Figure 4.14: Total carriage density (tdAUC) by study arm and inoculation dose.

simDatDensLong<-simDat %>%
  dplyr::mutate(density6BVisitC=0) %>%
  dplyr::select(c(pid,VaccName,doseGroup,carriage,visitC_date,visitE_date,visitF_date,visitG_date,density6BVisitC,density6BVisitE,density6BVisitF,density6BVisitG)) %>%
  dplyr::mutate(
    timeSinceVisitC_VisitC=0,
    timeSinceVisitC_VisitE=as.numeric(ymd(visitE_date)-ymd(visitC_date)),
    timeSinceVisitC_VisitF=as.numeric(ymd(visitF_date)-ymd(visitC_date)),
    timeSinceVisitC_VisitG=as.numeric(ymd(visitG_date)-ymd(visitC_date))
    ) %>%
  dplyr::select(!starts_with("visit"))

colnames(simDatDensLong)<-gsub(pattern="density6B",replacement="density6B_",colnames(simDatDensLong))

simDatDensLong<-simDatDensLong %>%
  tidyr::pivot_longer(cols=contains("Visit"),names_pattern="(.*)_Visit(.)",names_to=c("measurement","visit"),values_to="value") %>%
  tidyr::pivot_wider(id_cols=c(pid,VaccName,doseGroup,carriage,visit),names_from="measurement",values_from="value")

simDatDensLongAvg<-simDatDensLong %>%
  filter(carriage==1) %>%
  dplyr::group_by(VaccName,doseGroup,timeSinceVisitC) %>%
  dplyr::summarise(
    density6B=median(density6B)
  )

simDatDensLong %>%
  ggplot(mapping=aes(x=timeSinceVisitC,y=density6B,lty=pid,col=VaccName)) +
  geom_line(alpha=0.65) +
  theme_light() +
  geom_line(data=simDatDensLongAvg,mapping=aes(x=timeSinceVisitC,y=density6B,col=VaccName),lwd=1.75,lty=1) +
  xlab("Days since inoculation visit.") +
  ylab("Carriage density (CFU/ml)") +
  scale_y_continuous(trans = "log10") +
  scale_color_manual(values=c("steelblue","orange")) +
  scale_linetype_manual(values=sample(2:6,replace=TRUE,size=length(unique(simDatDensLong$pid)))) +
  guides(linetype="none",color="none") +
  facet_wrap(~VaccName+doseGroup,nrow=2) +
  labs(title="Density time profiles by study arm and inoculation dose.",caption="Dashed and dotted lines show individual profiles.\nThe thick solid lines show the median profiles among experimental carriers in each group.")
Density time profiles by study arm and inoculation dose.

Figure 4.15: Density time profiles by study arm and inoculation dose.

4.2.3.6 Effect of PCV-13 vaccine on natural, vaccine-type carriage

Given that the PCV-13 vaccine is meant to protect against 13 serotypes of S. pneumoniae, we will investigate whether we observe such a protective effect against vaccine-type serotypes.

However, as we are likely to observe much lower rates of natural carriage than experimental carriage we are almost surely not powered for this analysis and hence will primarily focus largely on a descriptive analysis, summarising vaccine-type carriage events in a 2x2 table and performing a simple Fisher exact test.

We will however attempt to fit the same model as for the primary analysis, just with natural, vaccine-type carriage as endpoint. If this model converges, we will report the results, but will be careful with over-interpreting the result given the expected low power for this analysis.

simDat<-simDat %>% 
  dplyr::mutate(carriageSerotypeNot6B=NA)

for(pid in unique(datCarriageSerotype$pid)){
  simDat$carriageSerotypeNot6B[simDat$pid==pid]<-paste(collapse=";",unique(datCarriageSerotype$serotype[datCarriageSerotype$pid==pid & datCarriageSerotype$carriage=="Natural"]))
}

simDat$carriageSerotypeNot6B[simDat$carriageSerotypeNot6B=="NA" | simDat$carriageSerotypeNot6B==""]<-NA

simDat<-simDat %>%
  dplyr::mutate(
    carriageNot6BVaccineType=case_when(
      is.na(carriageSerotypeNot6B)~0,
      !is.na(carriageSerotypeNot6B) & carriageSerotypeNot6B=="NVT"~0,
      TRUE~1
    ))

tabVT<-table(simDat$VaccName,simDat$carriageNot6BVaccineType)

pFishVT<-fisher.test(tabVT)$p.value
pFishVT<-ifelse(pFishVT>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFishVT)))," < 0.001")

tabVT %>%
  kable(caption=paste(sep="","Number of participants colonised with vaccine-type S. Pneumoniae by study arm. Exact Fisher test p-value ",pFishVT,"."),col.names=c("Not colonised","Colonised"),row.names=T) %>%
  kable_styling(full_width = F)
Table 4.26: Number of participants colonised with vaccine-type S. Pneumoniae by study arm. Exact Fisher test p-value = 0.732.
Not colonised Colonised
Saline 85 21
PCV-13 76 22
simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 160000","CFU 80000","CFU 20000"))

modNot6B<-logbin::logbin(carriageNot6BVaccineType~VaccName+doseGroup,data=simDat)

pGlm<-summary(modNot6B)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
rrGlmVacc<-exp(c(summary(modNot6B)$coefficients["VaccNamePCV-13","Estimate"],confint(modNot6B)["VaccNamePCV-13",]))

modRes<-summary(modNot6B)$coefficients %>%
  as.data.frame() %>%
  mutate(
    parameter=case_when(
      rownames(summary(modNot6B)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(modNot6B)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(modNot6B)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
      rownames(summary(modNot6B)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000"
    )
  )

colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")

modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","Summary of the log-binomial regression model fit for natural, vaccinet-type carriage. The baseline natural carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of natural carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),")."),digits=3) %>%
  kable_styling(full_width = FALSE)
Table 4.27: Summary of the log-binomial regression model fit for natural, vaccinet-type carriage. The baseline natural carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 21%, 95% CI (13.1%, 34.0%). There is no statistically significant effect of PCV-13 vaccine on the probability of natural carriage (RR 1.10, 95% CI (0.65, 1.87), p = 0.712).
Estimate Std. error Z statistic p-value
(Intercept) -1.554 0.242 -6.420 0.000
PCV-13 vaccine 0.100 0.270 0.370 0.712
Dose: CFU 80,000 -0.310 0.330 -0.939 0.347
Dose: CFU 20,000 0.210 0.324 0.648 0.517

4.2.3.7 Sub-group analyses

We repeat the above analyses stratified by sex.

4.2.3.8 Inoculum densities

We compute the average before and after inoculation densities of the inocula that were prepared, summarise these by their medians, inter-quartile ranges and ranges per target dose in Table 4.28 and visualise them on boxplots on Figure 4.16.

inoDat<-datInoculum %>%
  dplyr::select(vial,cfu_before,cfu_after,average_cfu) %>%
  dplyr::mutate(
    dose=as.integer(gsub(pattern="6B-F[0-9]+-",replacement="",vial))
  )
colnames(inoDat)<-c("vial","doseConcPre","doseConcPost","doseConcAvg","dose")

inoDatLong<-inoDat %>%
  tidyr::pivot_longer(cols=contains("doseConc"),names_to="type",values_to="doseConc") %>%
  dplyr::mutate(type=gsub(pattern="doseConc",replacement="",type)) %>%
  dplyr::mutate(
    doseGroup=factor(levels=c("CFU 20,000","CFU 80,000","CFU 160,000"),case_when(
      dose==20000~"CFU 20,000",
      dose==80000~"CFU 80,000",
      dose==160000~"CFU 160,000",
      TRUE~NA_character_
    ))
  ) %>%
  dplyr::mutate(
    type=factor(levels=c("Before","After","Average"),case_when(
      type=="Pre"~"Before",
      type=="Post"~"After",
      type=="Avg"~"Average"
    ))
  )

inoDatSumPre<-inoDatLong %>%
  dplyr::filter(type=="Before") %>%
  dplyr::group_by(doseGroup) %>%
  dplyr::summarise(
    median=format(nsmall=0,big.mark=",",trim=TRUE,round(median(doseConc))),
    iqr=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0.25,0.75))))),")"),
    range=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0,1))))),")"),
  )

inoDatSumPost<-inoDatLong %>%
  dplyr::filter(type=="After") %>%
  dplyr::group_by(doseGroup) %>%
  dplyr::summarise(
    median=format(nsmall=0,big.mark=",",trim=TRUE,round(median(doseConc))),
    iqr=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0.25,0.75))))),")"),
    range=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0,1))))),")"),
  )

inoDatSumAvg<-inoDatLong %>%
  dplyr::filter(type=="Average") %>%
  dplyr::group_by(doseGroup) %>%
  dplyr::summarise(
    median=format(nsmall=0,big.mark=",",trim=TRUE,round(median(doseConc))),
    iqr=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0.25,0.75))))),")"),
    range=paste(sep="","(",paste(collapse=", ",format(nsmall=0,big.mark=",",trim=TRUE,round(quantile(doseConc,prob=c(0,1))))),")"),
  )

rbind(inoDatSumPre,inoDatSumPost,inoDatSumAvg) %>%
  kable(row.names=FALSE,col.names=c("","Median","IQR","Range"),caption="Medians, inter-quartile ranges (IQRs) and ranges for the inoculum dose measurements before and after inoculation. Also shown are the summaries for the average measurements.") %>%
  kable_styling(full_width = FALSE) %>%
  pack_rows(group_label="Before inoculation",1,3) %>%
  pack_rows(group_label="After inoculation",4,6) %>%
  pack_rows(group_label="Average",7,9)
Table 4.28: Medians, inter-quartile ranges (IQRs) and ranges for the inoculum dose measurements before and after inoculation. Also shown are the summaries for the average measurements.
Median IQR Range
Before inoculation
CFU 20,000 20,334 (17,250, 21,333) (12,667, 24,333)
CFU 80,000 70,000 (63,333, 86,667) (43,333, 123,333)
CFU 160,000 153,333 (133,333, 190,000) (76,667, 226,667)
After inoculation
CFU 20,000 20,500 (17,500, 21,333) (12,000, 22,333)
CFU 80,000 71,666 (60,000, 85,834) (46,667, 113,333)
CFU 160,000 150,000 (130,000, 170,000) (53,333, 270,000)
Average
CFU 20,000 19,250 (17,750, 21,333) (12,334, 22,833)
CFU 80,000 74,167 (63,333, 79,583) (51,666, 105,000)
CFU 160,000 155,000 (133,334, 165,000) (81,666, 248,334)
inoDatLong %>%
  ggplot(mapping=aes(x=type,y=doseConc,col=doseGroup)) +
  geom_boxplot(fill="white",alpha=0.5) +
  geom_jitter(height=0,alpha=0.5) +
  geom_hline(mapping=aes(yintercept=ifelse(dose==20000,0.5*20000,NA)),col=blueCols[2],lty=2) +
  geom_hline(mapping=aes(yintercept=ifelse(dose==20000,2*20000,NA)),col=blueCols[2],lty=2) +
  geom_hline(mapping=aes(yintercept=ifelse(dose==80000,0.5*80000,NA)),col=blueCols[3],lty=2) +
  geom_hline(mapping=aes(yintercept=ifelse(dose==80000,2*80000,NA)),col=blueCols[3],lty=2) +
  geom_hline(mapping=aes(yintercept=ifelse(dose==160000,0.5*160000,NA)),col=blueCols[4],lty=2) +
  geom_hline(mapping=aes(yintercept=ifelse(dose==160000,2*160000,NA)),col=blueCols[4],lty=2) +
  theme_minimal() +
  theme(legend.position = "none") +
  scale_color_manual(values=blueCols[2:4]) +
  xlab("Measurement type") +
  ylab("Dose (CFU)") +
  ylim(c(0,2*160000)) +
  labs(caption="Dashed lines indicate tolerable ranges.") +
  facet_wrap(~doseGroup)
Boxplot, by target dose, of before and after inoculation dose measurements of the study inocula as well as the average.

Figure 4.16: Boxplot, by target dose, of before and after inoculation dose measurements of the study inocula as well as the average.

4.2.3.9 Comparison of culture and lytA/6B derived density measurements

We have stated that the culture derived density measurements are used as default in the above listed main analysis investigations. However, we also measure densities derived from lytA/6B multiplex qPCR. We compare the two sets of density measurements, and specifically:

  • Produce scatter plots of culture (x-axis) and lytA/6B multiplex qPCR (y-axis) measurements, stratified by study arm, dose group and 6B and not-6B serotypes.
  • Compute differences in log10(density+1) measurements between culture and lytA/6B multiples qPCR results and summarise these as histograms, stratified by 6B and not-6B serotypes.

[Waiting for lytA multiplex qPCR results to be finalised]

simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 20000","CFU 80000","CFU 160000"))

dens6B<-simDat %>%
  dplyr::select(c(PID,doseGroup,VaccName,density6BVisitE,density6BVisitF,density6BVisitG)) %>%
  tidyr::pivot_longer(cols=c(density6BVisitE,density6BVisitF,density6BVisitG),names_to="visit",values_to="density6B") %>%
  dplyr::mutate(visit=gsub(visit,pattern="density6B",replacement=""))
      
densNot6B<-simDat %>%
  dplyr::select(c(PID,doseGroup,VaccName,densityNot6BVisitC,densityNot6BVisitE,densityNot6BVisitF,densityNot6BVisitG)) %>%
  tidyr::pivot_longer(cols=c(densityNot6BVisitC,densityNot6BVisitE,densityNot6BVisitF,densityNot6BVisitG),names_to="visit",values_to="densityNot6B") %>%
  dplyr::mutate(visit=gsub(visit,pattern="densityNot6B",replacement=""))

densDatLong<-densNot6B %>%
  dplyr::mutate(density6B=dens6B$density6B[match(paste(sep="_",PID,visit),paste(sep="_",dens6B$PID,dens6B$visit))]) %>%
  tidyr::pivot_longer(cols=c(densityNot6B,density6B),values_to="density",names_to="serotype") %>%
  dplyr::mutate(serotype=gsub(serotype,pattern="density",replacement=""))

densDatLong<-densDatLong %>%
  dplyr::mutate(density_lytA=rnorm(nrow(densDatLong),mean=density,sd=density/10)) %>%
  dplyr::mutate(density_lytA=ifelse(density_lytA<0,0,density_lytA)) %>%
  dplyr::mutate(density_log10Diff=log10(density+1)-log10(density_lytA+1))

g1<-densDatLong %>%
  dplyr::filter(serotype=="6B") %>%
  ggplot(mapping=aes(x=log10(density+1),y=log10(density_lytA+1),col=VaccName)) +
  geom_point() +
  scale_color_manual(values=c("steelblue","orange")) +
  guides(color="none") +
  theme_light() +
  xlab("log10(culture density [CFU/ul] + 1)") +
  ylab("log10(lytA/6B density [copies/ml] + 1)") +
  facet_wrap(~doseGroup+VaccName,nrow=3) +
  labs(title="6B")

g2<-densDatLong %>%
  dplyr::filter(serotype=="Not6B") %>%
  ggplot(mapping=aes(x=log10(density+1),y=log10(density_lytA+1),col=VaccName)) +
  geom_point() +
  scale_color_manual(values=c("steelblue","orange")) +
  guides(color="none") +
  theme_light() +
  xlab("log10(culture density [CFU/ul]  + 1)") +
  ylab("log10(lytA/6B density [copies/ml] + 1)") +
  facet_wrap(~VaccName,nrow=1) +
  labs(title="Other serotypes")

grid.arrange(g1,g2,nrow=2,heights=c(3,1.3))
g1<-densDatLong %>%
  dplyr::filter(serotype=="6B") %>%
  dplyr::filter(density>0 | density_lytA>0) %>%
  ggplot(mapping=aes(x=density_log10Diff,fill=VaccName)) +
  geom_histogram(col="white",bins=20) +
  scale_fill_manual(values=c("steelblue","orange")) +
  guides(fill="none") +
  theme_light() +
  xlab("log10(culture density [CFU/ul] + 1) - log10(lytA/6B density [copies/ml] + 1)") +
  ylab("") +
  facet_wrap(~doseGroup+VaccName,nrow=3) +
  labs(title="6B")

g2<-densDatLong %>%
  dplyr::filter(serotype=="Not6B") %>%
  dplyr::filter(density>0 | density_lytA>0) %>%
  ggplot(mapping=aes(x=density_log10Diff,fill=VaccName)) +
  geom_histogram(col="white",bins=20) +
  scale_fill_manual(values=c("steelblue","orange")) +
  guides(fill="none") +
  theme_light() +
  xlab("log10(culture density [CFU/ul]+ 1) - log10(lytA/6B density [copies/ml] + 1)") +
  ylab("") +
  facet_wrap(~VaccName,nrow=1) +
  labs(title="Other serotypes")

grid.arrange(g1,g2,nrow=2,heights=c(3,1.3))

Finally, as a sensitivity analysis, we will repeat the above listed main analyses (investigations of differences in carriage density by inoculation group, study arms and serotype) using the lytA/6B density measurements.

4.2.4 Exploratory analyses

4.2.4.1 Dose-response curve and effective dose ED50

Using data from the saline arm only, we use the data from the different inoculation dose groups to estimate a dose-response curve, Figure 4.17. Specifically, we fit a logistic regression model (Table 4.29) with log dose as the predictor variable (equivalent to a 2-parameter log-logistic dose-response model Ritz et al. (2015)). For this we use the actual dose of inoculum that was administered, defined as the average of the pre- and post-administration dose measurements, rather than the specified target dose.

Using the same notation as above, writing \(Y\) for the carriage status (i.e. the response) and \(X_{conc}\) for the actual inoculation dose measured in CFU, we fit the following model:

\[ \mbox{logit}\left(E[Y|X_{conc}]\right) = \beta_0 + \beta_1\cdot\log(X_{conc}) \]

where \(Y|X_{conc}\) follows a Bernoulli distribution with parameter \(p=E[Y|X_{conc}]\).

From this model we derive an estimate of the effective dose \(ED50\), the dose at which on average 50% of participants become colonised with serotype 6B:

\[ ED50 = \exp\left(-\beta_0/\beta_1\right) \]

To derive a 95% confidence intervals for \(ED50\), we use bootstrap resampling of the data with B=10,000 samples. For each bootstrap sample, we compute an estimate of \(ED50\), thus building up an empirical distribution of \(ED50\) values. We use the percentile method to summarise that empiral distribution as a 95% confidence interval.

simDat<-simDat %>%
  dplyr::mutate(
    doseConcAvg=datInoculumDose$average_cfu[match(pid,datInoculumDose$pid)]
  )
### fit DRC model
#modDrc<-drc::drm(carriage~log(doseConcAvg),data=simDat %>% dplyr::filter(VaccName=="Saline"),fct=LL.2(),type="binomial")
modDrcGlm<-  glm(carriage~log(doseConcAvg),data=simDat %>% dplyr::filter(VaccName=="Saline"),family=binomial)

edFun<-function(modGlm,prob=0.5){
  cfs<-coef(summary(modGlm))[,"Estimate"]
  ed<-exp((log(prob/(1-prob))-cfs[1])/cfs[2])
  return(ed)
}

ed50<-edFun(modDrcGlm)
  
B<-1e4
ed50Vect<-rep(NA,B)
for(b in 1:B){
  simDatBS<-simDat[sample(1:nrow(simDat),size=nrow(simDat),replace=TRUE),]
  modDrcGlmBS<-  glm(carriage~log(doseConcAvg),data=simDatBS %>% dplyr::filter(VaccName=="Saline"),family=binomial)
  ed50Vect[b]<-edFun(modDrcGlmBS)
}

ed50CI<-quantile(ed50Vect,probs=c(0.025,0.975))
ed50Mod<-format(nsmall=0,trim=TRUE,big.mark=",",round(digits=0,c(ed50,ed50CI)))

modDrcSum<-summary(modDrcGlm)$coefficients %>%
  as.data.frame()
modDrcSum<-modDrcSum %>%
  mutate(parameter=rownames(modDrcSum))
colnames(modDrcSum)<-c("Estimate","Std.Error","z.value","p.value","parameter")
  
modDrcSum %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=T),cell_spec(format(nsmall=4,round(digits=4,p.value)),bold=F))) %>%
  dplyr::select(!parameter) %>%
  kable(escape=FALSE,col.names=c("Estimate","Std. error","z value","p value"),caption=paste(sep="","Summary of the dose-response model fitted to data from participants randomised to saline. ED50, the minimum dose that will on average result in 50% carriage is estimated to be ",ed50Mod[1]," with 95% CI (",paste(collapse=", ",ed50Mod[2:3]),").")) %>%
  kableExtra::kable_styling(full_width = FALSE)
Table 4.29: Summary of the dose-response model fitted to data from participants randomised to saline. ED50, the minimum dose that will on average result in 50% carriage is estimated to be 326,311 with 95% CI (150,084, 19,659,046).
Estimate Std. error z value p value
(Intercept) -8.9339455 3.8874954 -2.298124 0.0216
log(doseConcAvg) 0.7037038 0.3384935 2.078928 0.0376
dfPred<-data.frame(
  logDoseConcAvg=seq(log(1),log(400000),length=2000)
) %>%
  mutate(
    doseConcAvg=exp(logDoseConcAvg)
  )

tmpPred<-as.data.frame(predict(modDrcGlm,newdata=dfPred,interval="link",se.fit=TRUE))

invLogit<-function(x){
  res<-exp(x)/(1+exp(x))
  return(res)
}

dfPred<-dfPred %>%
  dplyr::mutate(
    y=invLogit(tmpPred$fit),
    yLow=invLogit(tmpPred$fit-qnorm(0.975)*tmpPred$se.fit),
    yUpp=invLogit(tmpPred$fit+qnorm(0.975)*tmpPred$se.fit),
    )

tmpDat<-simDat %>%
  dplyr::filter(VaccName=="Saline") %>%
  dplyr::group_by(dose) %>%
  dplyr::summarise(
    n=n(),
    k=sum(carriage),
    propCarriage=mean(carriage)
    ) %>%
  mutate(
    propCarriageLow=NA,
    propCarriageUpp=NA
  )

for(j in 1:nrow(tmpDat)){
  tmp<-binom.test(x=tmpDat$k[j],n=tmpDat$n[j])$conf.int
  tmpDat$propCarriageLow[j]<-tmp[1]
  tmpDat$propCarriageUpp[j]<-tmp[2]
}

g<-ggplot() +
  geom_segment(mapping=aes(x=ed50,xend=ed50,y=0,yend=0.5),lty=2,lwd=0.5,col="steelblue") +
  geom_segment(mapping=aes(x=0,xend=ed50,y=0.5,yend=0.5),lty=2,lwd=0.5,col="steelblue") +
  geom_ribbon(mapping=aes(xmin=ed50CI[1],xmax=ed50CI[2],y=seq(0,1,length=100)),fill="steelblue",alpha=0.25) +
  geom_line(data=dfPred,mapping=aes(x=doseConcAvg,y=y),col="orange",lwd=1) +
  geom_ribbon(data=dfPred,mapping=aes(x=doseConcAvg,ymin=yLow,ymax=yUpp),fill="orange",alpha=0.25) +
  geom_point(data=simDat,mapping=aes(x=doseConcAvg,y=carriage),col="darkgrey",alpha=0.5,size=2.5) +
  geom_point(data=tmpDat,mapping=aes(x=dose,y=propCarriage),col="black",size=5) +
  geom_errorbar(dat=tmpDat,mapping=aes(x=dose,ymin=propCarriageLow,ymax=propCarriageUpp),width=2000) +
  xlab("Dose (CFU)") +
  ylab("Probability of carriage") +
  labs(title="Dose-response curve estimated from the saline randomised study participants.",caption=paste(sep="","A 2-parameter log-logistic model was used for the dose-response curve estimation.\nGrey dots show the individual carriage data as a function of actual dose delivered.\nBlack dots with error bars show the estimated proportion of carriage by target dose and the associated 95% confidence intervals.\nDashed blue lines and band indicate the ED50 threshold dose ",ed50Mod[1]," CFU with the 95% CI (",paste(collapse=", ",ed50Mod[2:3]),").")) +
  theme_light() +
  theme(text=element_text(size=18)) +
  coord_cartesian(xlim=c(0,320000))

print(g)
Dose-response curve estimated from the data for participants randomised to the saline arm.

Figure 4.17: Dose-response curve estimated from the data for participants randomised to the saline arm.

pdf(width=12,height=8,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_doseResponseCurve.pdf"))
print(g)
dev.off()
## quartz_off_screen 
##                 2
png(width=12,height=8,units="in",res=450,file=paste(sep="","Results/",gsub(pattern="-",replacement="",Sys.Date()),"_doseResponseCurve.png"))
print(g)
dev.off()
## quartz_off_screen 
##                 2

4.2.4.2 Temporal dynamics of pneumococcal colonisation

The study did not run over long enough a period of time to properly investigate the presence or absence of seasonal effects. Using flexible regression models, we can however investigate whether there seems to be evidence for temporal variation of colonisation probabilities (for both experimental and natural carriage). Specifically we can investigate:

  1. Whether there is temporal variation in the probability of natural colonisation, i.e. carriage of vaccine-type serotypes other than the experimental 6B strain.

  2. Whether there is temporal variation in the probability of experimental colonisation by serotype 6B.

The first would most likely point to variation in pneumoccocal exposure over time, whereas the second could point to variations in susceptivility of colonisation over time.

For these investigations, we refit the model from the main outcome analysis, but include restricted cubic spline terms for time of Visit C for each participant since Visit C of the first study participant, see Tables @(tab:exploObjTimeEffectNot6B) and @(tab:exploObjTimeEffect6B).

To test the null hypothesis of no temporal variation, we will perform likelihood ratio tests comparing the 2 models to the respective models without the time variable.

As for the secondary analysis for natural, vaccine-type carriage, we need to point out that there may be far too few natural carriage events to fit the natural carriage model. Also the study was not powered to investigate temporal investigations, so we will report results mostly descriptively and for the purpose of hypothesis generation.

firstVisitC<-min(ymd(simDat$visitC_date))
simDat<-simDat %>%
  dplyr::mutate(timeVisitC_sinceStudyStart=as.numeric(ymd(visitC_date)-firstVisitC))

tmp<-rcs(simDat$timeVisitC_sinceStudyStart) # need to manually add the spline terms as no direct support by logbin
simDat<-simDat %>%
  dplyr::mutate(
    timeVisitC_rcs1=as.numeric(tmp[,1]),
    timeVisitC_rcs2=as.numeric(tmp[,2]),
    timeVisitC_rcs3=as.numeric(tmp[,3]),
    timeVisitC_rcs4=as.numeric(tmp[,4])
  )
simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 160000","CFU 80000","CFU 20000"))

modTimeNot6B<-logbin::logbin(carriageNot6B~VaccName+doseGroup+timeVisitC_rcs1+timeVisitC_rcs2+timeVisitC_rcs3+timeVisitC_rcs4,data=simDat,start=c(coef(mod),rep(0,4)),method="ab") # if boundary issues with EM algorithm, we will use method="glm2" or method="ab"

pGlm<-summary(modTimeNot6B)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
pLrt<-lmtest::lrtest(modTimeNot6B,modNot6B)$`Pr(>Chisq)`[2]
rrGlmVacc<-exp(c(summary(modTimeNot6B)$coefficients["VaccNamePCV-13","Estimate"],confint(modTimeNot6B)["VaccNamePCV-13",]))

modRes<-summary(modTimeNot6B)$coefficients %>%
  as.data.frame() %>%
  mutate(
    parameter=case_when(
      rownames(summary(modTimeNot6B)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(modTimeNot6B)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(modTimeNot6B)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
      rownames(summary(modTimeNot6B)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000",
      rownames(summary(modTimeNot6B)$coefficients)=="timeVisitC_rcs1"~"Time since study start",
      rownames(summary(modTimeNot6B)$coefficients)=="timeVisitC_rcs2"~"Time since study start'",
      rownames(summary(modTimeNot6B)$coefficients)=="timeVisitC_rcs3"~"Time since study start''",
      rownames(summary(modTimeNot6B)$coefficients)=="timeVisitC_rcs4"~"Time since study start'''"
    )
  )

colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")

modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","Summary of the log-binomial regression model fit for natural, vaccine-type carriage with terms for time since study start. The baseline natural carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of natural carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),"). There is ",ifelse(pLrt<0.05,"a","no")," statistically significant effect of time since study start on the probability of natural carriage."),digits=3) %>%
  kable_styling(full_width = FALSE)
Table 4.30: Summary of the log-binomial regression model fit for natural, vaccine-type carriage with terms for time since study start. The baseline natural carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 33%, 95% CI ( 5.3%, 204.3%). There is no statistically significant effect of PCV-13 vaccine on the probability of natural carriage (RR 1.10, 95% CI (0.79, 1.51), p = 0.575). There is a statistically significant effect of time since study start on the probability of natural carriage.
Estimate Std. error Z statistic p-value
(Intercept) -1.108 0.930 -1.192 0.233
PCV-13 vaccine 0.092 0.164 0.560 0.575
Dose: CFU 80,000 0.252 0.744 0.338 0.735
Dose: CFU 20,000 0.309 0.852 0.362 0.717
Time since study start -0.002 0.004 -0.517 0.605
Time since study start’ 0.004 0.008 0.513 0.608
Time since study start’’ 0.012 0.110 0.106 0.916
Time since study start’’’ -0.142 0.317 -0.449 0.653
simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 160000","CFU 80000","CFU 20000"))

modTime6B<-logbin::logbin(carriage~VaccName+doseGroup+timeVisitC_rcs1+timeVisitC_rcs2+timeVisitC_rcs3+timeVisitC_rcs4,data=simDat,start=c(coef(mod),rep(0,4)),method="ab")

pGlm<-summary(modTime6B)$coefficients["VaccNamePCV-13","Pr(>|z|)"]
pLrt<-lmtest::lrtest(modTime6B,mod)$`Pr(>Chisq)`[2]
rrGlmVacc<-exp(c(summary(modTime6B)$coefficients["VaccNamePCV-13","Estimate"],confint(modTime6B)["VaccNamePCV-13",]))

modRes<-summary(modTime6B)$coefficients %>%
  as.data.frame() %>%
  mutate(
    parameter=case_when(
      rownames(summary(modTime6B)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(modTime6B)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(modTime6B)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
      rownames(summary(modTime6B)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000",
      rownames(summary(modTime6B)$coefficients)=="timeVisitC_rcs1"~"Time since study start",
      rownames(summary(modTime6B)$coefficients)=="timeVisitC_rcs2"~"Time since study start'",
      rownames(summary(modTime6B)$coefficients)=="timeVisitC_rcs3"~"Time since study start''",
      rownames(summary(modTime6B)$coefficients)=="timeVisitC_rcs4"~"Time since study start'''"
    )
  )

colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")

modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","Summary of the log-binomial regression model fit for experimental serotype 6B carriage with terms for time since study start. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),"). There is ",ifelse(pLrt<0.05,"a","no")," statistically significant effect of time since study start on the probability of experimental serotype 6B carriage."),digits=3) %>%
  kable_styling(full_width = FALSE)
Table 4.31: Summary of the log-binomial regression model fit for experimental serotype 6B carriage with terms for time since study start. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is 490%, 95% CI ( 1.0%, 237907.7%). There is a statistically significant effect of PCV-13 vaccine on the probability of serotype 6B carriage (RR 0.39, 95% CI (0.20, 0.74), p = 0.004). There is a statistically significant effect of time since study start on the probability of experimental serotype 6B carriage.
Estimate Std. error Z statistic p-value
(Intercept) 1.589 3.156 0.503 0.615
PCV-13 vaccine -0.948 0.332 -2.853 0.004
Dose: CFU 80,000 0.523 1.222 0.428 0.669
Dose: CFU 20,000 -3.406 2.910 -1.170 0.242
Time since study start -0.021 0.020 -1.070 0.285
Time since study start’ 0.025 0.029 0.847 0.397
Time since study start’’ -0.043 0.228 -0.186 0.852
Time since study start’’’ -0.157 0.540 -0.291 0.771

4.2.4.3 Time between vaccination (visit B) and inoculation (visit D)

We provide histograms and tables of medians and inter-quartile ranges (overall, by study arm, by inoculation dose and by experimental carriage status) of the time between visits B and D to describe the variation and to explore whether there are differences between study arms and between carriers and non-carriers.

As there are no substantial differences in time between vaccination and inoculation visits between carriers and non-carriers (Table 4.32 and Figure 4.18), we did not include this variable in the main objective model for phase 1 as a sensitivity analysis.

simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 20000","CFU 80000","CFU 160000"))

simDat <- simDat %>%
  dplyr::mutate(
    timeBtoD=as.numeric(ymd(visitD_date)-ymd(visitB_date)) # visit C is 1 week = 7 days before visit D
  )

timeBtoD<-simDat %>%
  dplyr::summarise(
    median=median(timeBtoD),
    IQR=paste(sep="","(",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.25))),",",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.75))),")")
  )
timeBtoD<-cbind("All",timeBtoD)
colnames(timeBtoD)<-c("group","median","IQR")

timeBtoD_vacc<-simDat %>%
  dplyr::group_by(VaccName) %>%
  dplyr::summarise(
    median=median(timeBtoD),
    IQR=paste(sep="","(",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.25))),",",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.75))),")"),
    .groups="drop"
  )
colnames(timeBtoD_vacc)<-c("group","median","IQR")
pVacc<-wilcox.test(timeBtoD~VaccName,data=simDat)$p.value

timeBtoD_dose<-simDat %>%
  dplyr::group_by(doseGroup) %>%
  dplyr::summarise(
    median=median(timeBtoD),
    IQR=paste(sep="","(",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.25))),",",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.75))),")"),
    .groups="drop"
  )
colnames(timeBtoD_dose)<-c("group","median","IQR")
pDose<-kruskal.test(timeBtoD~doseGroup,data=simDat)$p.value

timeBtoD_carr<-simDat %>%
  dplyr::group_by(carriage) %>%
  dplyr::summarise(
    median=median(timeBtoD),
    IQR=paste(sep="","(",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.25))),",",format(nsmall=2,round(digits=2,quantile(timeBtoD,probs=0.75))),")"),
    .groups="drop"
  )
colnames(timeBtoD_carr)<-c("group","median","IQR")
pCarr<-wilcox.test(timeBtoD~carriage,data=simDat)$p.value
timeBtoD_carr[,1]<-case_when(as.logical(timeBtoD_carr[,1]==0)~"No",as.logical(timeBtoD_carr[,1]==1)~"Yes")

timeBtoD_fullTab<-rbind(timeBtoD,timeBtoD_vacc,timeBtoD_dose,timeBtoD_carr)

timeBtoD_fullTab %>%
  knitr::kable(col.names=c(" ","Median","IQR"),digits=2,caption="Median and inter-quartile range for the time between vaccination and inoculation visits overall, by study arm, inoculation dose and carriage status.") %>%
  kableExtra::kable_styling(full_width=FALSE) %>%
  kableExtra::pack_rows(group_label="Overall",start=1,end=1) %>%
  kableExtra::pack_rows(group_label=paste(sep="","Study Arm (Mann-Whitney-Wilcoxon p = ",format(nsmall=4,round(digits=4,pVacc)),")."),start=2,end=3) %>%
  kableExtra::pack_rows(group_label=paste(sep="","Inoculation dose (Kruskal-Wallis p = ",format(nsmall=4,round(digits=4,pDose)),")."),start=4,end=6) %>%
  kableExtra::pack_rows(group_label=paste(sep="","Experimental 6B carriage (Mann-Whitney-Wilcoxon p = ",format(nsmall=4,round(digits=4,pCarr)),")."),start=7,end=8)
Table 4.32: Median and inter-quartile range for the time between vaccination and inoculation visits overall, by study arm, inoculation dose and carriage status.
Median IQR
Overall
All 53.0 (40.00,62.00)
Study Arm (Mann-Whitney-Wilcoxon p = 0.7709).
Saline 48.0 (40.00,62.75)
PCV-13 54.0 (41.00,62.00)
Inoculation dose (Kruskal-Wallis p = 0.0000).
CFU 20000 28.0 (27.00,113.75)
CFU 80000 44.5 (40.00,53.75)
CFU 160000 60.5 (53.25,67.75)
Experimental 6B carriage (Mann-Whitney-Wilcoxon p = 0.2441).
No 54.0 (40.00,62.00)
Yes 46.5 (40.75,58.00)
g1<-simDat %>%
  ggplot(mapping=aes(x=timeBtoD)) +
  geom_histogram(bins=length(unique(simDat$timeBtoD)),col="white") +
  theme_light() +
  xlab("") +
  ylab("Frequency") +
  labs(title="All participants")

g2<-simDat %>%
  ggplot(mapping=aes(x=timeBtoD,fill=VaccName)) +
  geom_histogram(bins=length(unique(simDat$timeBtoD)),col="white") +
  theme_light() +
  scale_fill_manual(values=c("steelblue","orange")) +
  xlab("") +
  ylab("Frequency") +
  labs(title="By study arm.") +
  guides(fill="none") +
  facet_wrap(~VaccName)

g3<-simDat %>%
  ggplot(mapping=aes(x=timeBtoD,fill=doseGroup)) +
  geom_histogram(bins=length(unique(simDat$timeBtoD)),col="white") +
  theme_light() +
  scale_fill_manual(values=c("grey80","grey50","grey20")) +
  xlab("") +
  ylab("Frequency") +
  labs(title="By inoculation dose.") +
  guides(fill="none") +
  facet_wrap(~doseGroup)

g4<-simDat %>%
  ggplot(mapping=aes(x=timeBtoD,fill=ifelse(carriage==1,"Experimental carriage.","No experimental carriage"))) +
  geom_histogram(bins=length(unique(simDat$timeBtoD)),col="white") +
  theme_light() +
  scale_fill_manual(values=c("grey40","grey20")) +
  xlab("Time between visits B and D (days)") +
  ylab("Frequency") +
  labs(title="By experimental 6B carriage.") +
  guides(fill="none") +
  facet_wrap(~ifelse(carriage==1,"Experimental carriage.","No experimental carriage"))

grid.arrange(g1,g2,g3,g4,nrow=4)
Time between vaccination and inoculation visits overall, by study arm, inoculation dose and carriage status.

Figure 4.18: Time between vaccination and inoculation visits overall, by study arm, inoculation dose and carriage status.

4.2.4.4 Community vs. student participants

MARVELS PCV-13 recruited both among healthy general population members (“community”) and university students. As an exploratory analysis we will compare the carriage rates between both sets of participants. Specifically, we will add a term for participant type (community or student) \(X_{type}\) as predictor in the main log-binomial regression model and report its estimate and 95% confidence interval:

\[ \log(\pi) = \log\left(E[Y|X_{20,000},X_{80,000},X_{PCV13}]\right) = \beta_0 + \beta_{20,000}\cdot X_{20,000} + \beta_{80,000}\cdot X_{80,000} + \beta_{PCV13}\cdot X_{PCV13} + \beta_{type}\cdot X_{type} \]

simDat$doseGroup<-factor(as.character(simDat$doseGroup),levels=c("CFU 160000","CFU 80000","CFU 20000"))
modTime6B.comstud<-logbin::logbin(carriage~VaccName+doseGroup+student,data=simDat,)

pGlm<-summary(modTime6B.comstud)$coefficients["student","Pr(>|z|)"]
rrGlmVacc<-exp(c(summary(modTime6B.comstud)$coefficients["student","Estimate"],confint(modTime6B.comstud)["student",]))

modRes<-summary(modTime6B.comstud)$coefficients %>%
  as.data.frame() %>%
  mutate(
    parameter=case_when(
      rownames(summary(modTime6B.comstud)$coefficients)=="(Intercept)"~"(Intercept)",
      rownames(summary(modTime6B.comstud)$coefficients)=="VaccNamePCV-13"~"PCV-13 vaccine",
      rownames(summary(modTime6B.comstud)$coefficients)=="doseGroupCFU 20000"~"Dose: CFU 20,000",
      rownames(summary(modTime6B.comstud)$coefficients)=="doseGroupCFU 80000"~"Dose: CFU 80,000",
      rownames(summary(modTime6B.comstud)$coefficients)=="student"~"Student"
    )
  )

colnames(modRes)<-c("Estimate","Std.err","z.value","p.value","parameter")

modRes %>%
  dplyr::select(parameter,Estimate,Std.err,z.value,p.value) %>%
  mutate(p.value=ifelse(parameter!="(Intercept)" & p.value<0.05,cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=T),cell_spec(format(nsmall=3,round(digits=3,p.value)),bold=F))) %>%
  kable(row.names = FALSE,col.names=c("","Estimate","Std. error","Z statistic","p-value"),escape=FALSE,caption=paste(sep="","Summary of the log-binomial regression model fit for experimental serotype 6B carriage with a term for student or community member. The baseline carriage risk for a saline-randomised participant inoculated at the CFU 160,000 dose is ",format(digits=1,round(digits=1,100*exp(modRes["(Intercept)","Estimate"]))),"%, 95% CI (",paste(collapse="%, ",format(nsmall=1,round(digits=1,100*exp(c(modRes["(Intercept)","Estimate"]-qnorm(0.975)*modRes["(Intercept)","Std.err"],modRes["(Intercept)","Estimate"]+qnorm(0.975)*modRes["(Intercept)","Std.err"]))))),"%). There is ",ifelse(pGlm<0.05,"a","no")," statistically significant effect of student / community membership on the probability of serotype 6B carriage (RR of carriage in students compared to community members ",format(nsmall=2,round(digits=2,rrGlmVacc[1])),", 95% CI (",paste(collapse=", ",format(nsmall=2,round(digits=2,rrGlmVacc[2:3]))),"), p ",ifelse(pGlm>=0.001,paste(sep="","= ",format(nsmall=3,round(digits=3,pGlm))),"< 0.001"),")."),digits=3) %>%
  kable_styling(full_width = FALSE)

4.3 Phase 2

Not done yet.

4.4 Safety data

We list all reported adverse events (AEs) and serious adverse events (SAEs). As per protocol, no further differentiation into reactions or unexpected events will be made as very few AEs and SAEs are expected.

As there are not enough events, we will not compare the frequency of events by type across study arms and inoculation doses.

[to be compiled]

safeDat<-data.frame(
  PID=sample(x=simDat$PID,replace=TRUE,size=20),
  VaccName=NA,
  dose=NA,
  typeOfEvent=sample(x=c("AE","SAE"),size=20,replace=TRUE,prob=c(0.9,0.1)),
  daysSinceVaccination=sample(x=1:70,size=20,replace=TRUE),
  daysSinceInoculation=sample(x=c(NA,0:14),size=20,prob=c(0.05,0.15,0.22,0.18,0.1,0.08,0.04,rep(0.02,9)),replace=TRUE), # NA means before inoculation
  description="Short description of event.",
  treatment="Summary of treatments given, if any.",
  notes="Any other relevant observations."
)

safeDat<-safeDat %>%
  dplyr::mutate(
    VaccName=simDat$VaccName[match(PID,simDat$PID)],
    dose=simDat$dose[match(PID,simDat$PID)],
    daysSinceVaccination=daysSinceVaccination+daysSinceInoculation
  )

safeDat<-safeDat[order(safeDat$dose,safeDat$VaccName,safeDat$PID),]

safeDat %>%
  knitr::kable(caption="(DUMMY TABLE) Listing of all clinical events reported during the study.") %>%
  kableExtra::kable_styling(full_width = FALSE)

simDat<-simDat %>%
  dplyr::mutate(
    aeOrSae=ifelse(PID %in% safeDat$PID,1,0),
    ae=ifelse(PID %in% safeDat$PID[safeDat$typeOfEvent=="AE"],1,0),
    sae=ifelse(PID %in% safeDat$PID[safeDat$typeOfEvent=="SAE"],1,0),
  )
tabAeSaeArm<-table(simDat$VaccName,simDat$aeOrSae)
tabAeSaeDose<-table(simDat$dose,simDat$aeOrSae)
tabAeArm<-table(simDat$VaccName,simDat$ae)
tabAeDose<-table(simDat$dose,simDat$ae)
tabSaeArm<-table(simDat$VaccName,simDat$sae)
tabSaeDose<-table(simDat$dose,simDat$sae)

tabFull<-rbind(tabAeSaeArm,tabAeSaeDose,tabAeArm,tabAeDose,tabSaeArm,tabSaeDose)

pFishAeSaeArm<-fisher.test(tabAeSaeArm)$p.value
pFishAeSaeArm<-ifelse(pFishAeSaeArm>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFishAeSaeArm)))," < 0.001")
pFishAeSaeDose<-fisher.test(tabAeSaeDose)$p.value
pFishAeSaeDose<-ifelse(pFishAeSaeDose>=0.001,paste(sep=""," = ",format(nsmall=3,round(digits=3,pFishAeSaeDose)))," < 0.001")

tabFull %>%
  kable(caption=paste(sep="","Safety events tabulated by study arm and inoculation dose."),col.names=c("No event","Event"),row.names=T) %>%
  kable_styling(full_width = F) %>%
  pack_rows(paste(sep="","Any event (AE or SAE) by study arm (Fisher test p",pFishAeSaeArm,")"),1,2) %>%
  pack_rows(paste(sep="","Any event (AE or SAE) by inoculation dose (Fisher test p",pFishAeSaeDose,")"),3,5) %>%
  pack_rows(paste(sep="","AEs by study arm"),6,7) %>%
  pack_rows(paste(sep="","AEs by inoculation dose"),8,10) %>%
    pack_rows(paste(sep="","SAEs by study arm"),11,12) %>%
  pack_rows(paste(sep="","SAEs by inoculation dose"),13,15)

5 List of figures

Figure 3.1: Number of nasal wash samples acquired per visit as of 07 October 2022.

Figure 3.2: CFU concentrations over the study period.

Figure 4.1: CONSORT flow diagram.

Figure 4.2: Serotype 6B carriage proportions per study arm and inoculation dose.

Figure 4.3: Natural pneumococcal carriage by study arm and visit.

Figure 4.4: Natural pneumococcal carriage by study arm and visit, stratified by whether carriage was vaccine-type or not.

Figure 4.5: LytA PCR derived Serotype 6B carriage proportions per study arm and inoculation dose.

Figure 4.6: Scatterplot of logged culture and LytA PCR measured carriage densities.

Figure 4.7: Comparison of culture derived density measurements between culture positive, PCR negative and culture positive, PCR positive samples.

Figure 4.8: Serotype 6B carriage proportions per study arm and inoculation dose at the different visits.

Figure 4.9: Median carriage densities at the different study visits, stratified by study arm, serotype and inoculation dose.

Figure 4.10: Jitter and box plot of carriage density at the different study visits, stratified by serotype, study arm and inoculation dose.

Figure 4.11: Median LytA carriage densities at the different study visits, stratified by study arm and inoculation dose.

Figure 4.12: Jitter and box plot of LytA carriage density at the different study visits by study arm and inoculation group.

Figure 4.13: Mean carriage duration by serotype, study arm and visit of first carriage.

Figure 4.14: Total carriage density (tdAUC) box and jitter plots by study arm and inoculation dose.

Figure 4.15: Density time profiles by study arm and inoculation dose.

Figure 4.16: Boxplot, by target dose, of before and after inoculation dose measurements of the study inocula as well as the average.

Figure 4.17: Dose-response curve estimated from the data for participants randomised to the saline arm.

Figure 4.18: Time between vaccination and inoculation visits overall, by study arm, inoculation dose and carriage status.

#Figure \@ref(fig:secObjDensityComparisonScatter): Scatter plots of culture and lytA/6B density measurements by study arm, inoculation dose and serotype.

#Figure \@ref(fig:secObjDensityComparisonHist): Histogram of differences in log10(density + 1) measurements derived from culture and lytA/6B.

6 List of tables

Table 4.1: Participant characteristics.

Table 4.2: Carriage status by study arm (for each inoculation dose and overall).

Table 4.3: Summary of the log-binomial regression model fit.

Table 4.4: Tabulated counts and percentages of experimental 6B carriers stratified by natural carriage (at any of visits C, E, F, G) and study arm.

Table 4.5: Tabulated counts and percentages of experimental 6B carriers stratified by natural carriage (at visit C) and study arm.

Table 4.6: Tabulated counts and percentages of experimental carriers stratified by randomisation arm and natural carriage (with risk ratios for experimental carriage). Natural carriage is defined here as being a carrier of a non-6B strain at visit C.

Table 4.7: Summary of the log-binomial regression model fits, stratified by natural carriage status (at any of Visits C, E, F or G). The ‘Estimate’ column in the table below shows the baseline risk on the ‘(Intercept)’ row and the relative risk associated with each of the exposures on the subsequent rows.

Table 4.8: Summary of the log-binomial regression model fits, stratified by natural carriage status (at Visit C). The ‘Estimate’ column in the table below shows the baseline risk on the ‘(Intercept)’ row and the relative risk associated with each of the exposures on the subsequent rows.

Table 4.9: Summary of the log-binomial regression model fit, adjusting for natural carriage (any visit).

Table 4.10: Summary of the log-binomial regression model fit, adjusting for natural carriage (at Visit C).

Table 4.11: Culture vs LytA PCR results for 6B carriage at visit E.

Table 4.12: Culture vs LytA PCR results for 6B carriage at visit F.

Table 4.13: Culture vs LytA PCR results for 6B carriage at visit G.

Table 4.14: Comparison of culture and LytA PCR derived carriage status.

Table 4.15: Number of participants colonised with experimental S. Pneumoniae serotype 6B (as determined by LytA PCR) by study arm and inoculation dose.

Table 4.16: Summary of the log-binomial regression model fit (for the LytA PCR derived carriage outcome).

Table 4.17: Pearson and Spearman correlation coefficients calculated for the logged culture and LytA PCR derived density measurements.

Table 4.18: Results from the GEE model fit to carriage density data. Model fitted only to data from participants who were carriers at the time of each given visit.

Table 4.19: Results from the GEE model fit to carriage density data from the CFU 20,000 inoculation dose group only. Model fitted only to data from participants who were carriers at the time of each given visit.

Table 4.20: Results from the GEE model fit to carriage density data from the CFU 80,000 inoculation dose group only. Model fitted only to data from participants who were carriers at the time of each given visit.

Table 4.21: Results from the GEE model fit to carriage density data from the CFU 160,000 inoculation dose group only. Model fitted only to data from participants who were carriers at the time of each given visit.

Table 4.22: Results from the GEE model fit to LytA PCR density data. Model fitted only to data from participants who were carriers at the time of each given visit.

Table 4.23: Results from the GEE model fitted to LytA PCR density data for the CFU 80,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.

Table 4.24: Results from the GEE model fitted to LytA PCR density data for the CFU 160,000 inoculation group. Model fitted only to data from participants who were carriers at the time of each given visit.

Table 4.25: Mann-Whitney-Wilcoxon test result for comparing tdAUC between study arms by inoculation group.

Table 4.26: Number of participants colonised with vaccine-type S. Pneumoniae by study arm.

Table 4.27: Summary of the log-binomial regression model fit for natural, vaccine-type carriage.

Table 4.28: Medians, inter-quartile ranges (IQRs) and ranges for the inoculum dose measurements before and after inoculation. Also shown are the summaries for the average measurements.

Table 4.29: Summary of the dose-response model fitted to data from participants randomised to saline.

Table 4.30: Summary of the log-binomial regression model fit for natural, vaccinet-type carriage with terms for time since study start.

Table 4.31: Summary of the log-binomial regression model fit for experimental serotype 6B carriage with terms for time since study start.

#Table \@ref(tab:phase2ModelSummary): Summary of the log-binomial regression model fit for phase 2 carriage (no term for prior carriage).

#Table \@ref(tab:phase2Tables2x2): Number of participants colonised with S. Pneumoniae during phase 2 by PCV-13 vaccination status and prior carriage, stratified by phase 1 inoculation dose.

#Table \@ref(tab:secObjPhase2): Summary of the log-binomial regression model fit for phase 2 carriage (with a term for natural carriage during phase 1).

Table 4.32: Median and inter-quartile range for the time between vaccination and inoculation visits overall, by study arm, inoculation dose and carriage status.

#Table \@ref(tab:safetyDataAllEvents): Listing of all clinical events reported during the study.

#Table \@ref(tab:safetyData2x2Tables): Safety events tabulated by study arm and inoculation dose.

References

Ritz, Christian, Florent Baty, Jens C. Streibig, and Daniel Gerhard. 2015. “Dose-Response Analysis Using R.” Edited by Yinglin Xia. PLOS ONE 10 (12): e0146021. https://doi.org/10.1371/journal.pone.0146021.
Ritz, Christian, Signe M. Jensen, Daniel Gerhard, and Jens C. Streibig. 2020. Dose-Response Analysis Using R. Chapman & Hall/CRC the R Series. Boca Raton: CRC Press, Taylor and Francis Group.